Abstract 

For a positive number $q$, the Mallows measure on the symmetric group is the probability measure 
on $S_n$ such that $P_{n,q}(\pi)$ is proportional to $q^{\mathrm{inv}(\pi)}$, where $\mathrm{inv}(\pi)$ is the number 
of inversions, i.e., the number of pairs $i < j$ such that $\pi_i > \pi_j$. One may consider this 
as a mean-field model from statistical mechanics. The weak large deviation principle may replace 
the Gibbs variational principle for characterizing equilibrium measures. In this sense, we prove 
absence of phase transition, i.e., phase uniqueness. 

1 Introduction 

The Mallows measure on permutations is a non-uniform measure which may be motivated in various 
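ways. It arises in non-parametric statistics [?]: 

$$\forall \pi \in S_n, \quad P_{n,q}(\pi) = \frac{q^{\mathrm{inv}(\pi)}}{Z_{n,q}}, \quad \text{where} \quad \mathrm{inv}(\pi) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \mathbf{1}_{(-\infty,0)}(\pi_j - \pi_i),$$

for some parameter $q \in (0,\infty)$. 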

A good recent review of several important examples of non-uniform measures is [21]. In addition, in 
that reference, Mukherjee considered thermodynamic limits, and he derived large deviation principles. 
That is the starting point for us. 

The Mallows model has also been studied for other reasons. Diaconis and Ram showed a connection 
to the Hecke algebra [9]. The Mallows measure is a measure on permutations, but it is also closely 
related to the “blocking measures,” which are invariant measures of the asymmetric exclusion process 
(ASEP). In fact the ASEP may be viewed as a projection of a biased card shuffling algorithm introduced 
by Diaconis and Ram. This was exploited by Benjamini, Berger, Hoffman and Mossel [5], using David 
Bruce Wilson’s height functions [32] to bound the mixing time for the card shuffling model, starting 
from the mixing time for the ASEP. 

At a simpler level, one may try to use information about the ASEP invariant measures to gain 
information about the Mallows measure. The ASEP invariant measures are well-known. See, for 
example, Chapter VIII, Section 5 of [16]. This is a well-known approach, following Wilson [32]. 

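In the present note, we consider the large deviation principle for a continuous version of the Mallows 
model, on $([0,1]^2)^n$, such that 

$$d\mu_{n,\beta}((x_1,y_1),\dots,(x_n,y_n)) = \frac{1}{Z_n(\beta)} \exp\big(-\beta H_n((x_1,y_1),\dots,(x_n,y_n))\big) \prod_{k=1}^{n} dx_k\, dy_k,$$

$$H_n((x_1,y_1),\dots,(x_n,y_n)) = \frac{1}{n-1} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} h((x_i,y_i),(x_j,y_j)), \qquad (1)$$

$$h((x_1,y_1),(x_2,y_2)) = \mathbf{1}_{(-\infty,0)}((x_1 - x_2)(y_1 - y_2)).$$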
Mukherjee found the large deviation rate function $I_\beta$ on $\mathcal{M}_{+,1}([0,1]^2)$ for the empirical measure of 
$((x_1,y_1),\dots,(x_n,y_n))$ on $[0,1]^2$, 

$$\frac{1}{n} \sum_{k=1}^{n} \delta_{(x_k,y_k)}.$$

We start with his formula for the rate function, and we show that there is a unique optimizer. There 
is a straightforward connection between $\mu_{n,\beta}$ and $P_{n,q}$. Therefore, this gives a direct probabilistic 
method to find the weak limit law for the empirical measure of a Mallows-distributed permutation. 

1.1 Discussion of proof technique and relation to known results 

Following the approach suggested by the height functions, we consider the 4-square problem. Given 
$\theta \in (0,1)$ define $L_1(\theta) = [0,\theta]$ and $L_2(\theta) = (\theta,1]$. Given $\theta_1,\theta_2 \in (0,1)$ define 
$\Lambda_{ij}(\theta_1,\theta_2) = L_i(\theta_1) \times L_j(\theta_2)$ in $[0,1]^2$. Then, finally, given $t_{11},t_{12},t_{21},t_{22} \geq 0$ such that 
$t_{11}+t_{12}+t_{21}+t_{22} = 1$, define 

$$W_{\theta_1,\theta_2}(t_{11},t_{12},t_{21},t_{22}) = \{\nu \in \mathcal{M}_{+,1}([0,1]^2) : \forall i,j \in \{1,2\},\ \nu(\Lambda_{ij}(\theta_1,\theta_2)) = t_{ij}\}.$$

We give an explicit formula for $I_\beta(W_{\theta_1,\theta_2}(t_{11},t_{12},t_{21},t_{22}))$. Intuitively, this is the simplest possible 
problem one can consider, starting from Mukherjee’s formula for $I_\beta$. Moreover, for each 
$(\theta_1,\theta_2) \in (0,1)^2$, there is a unique choice of $t^*_{ij}(\theta_1,\theta_2)$ optimizing this formula. Moreover, 

$$R_\beta(\theta_1,\theta_2) = t^*_{11}(\theta_1,\theta_2),$$

defined for each $\theta_1,\theta_2 \in [0,1]$ (and extended continuously at the boundary) corresponds to the joint 
cumulative distribution function of a measure $\rho_\beta \in \mathcal{M}_{+,1}([0,1]^2)$: $\rho_\beta([0,\theta_1] \times [0,\theta_2]) = R_\beta(\theta_1,\theta_2)$. 
As a corollary, elementary results imply that the empirical measures $\frac{1}{n}\sum_{k=1}^{n} \delta_{(x_k,y_k)}$ converge in 
distribution to the non-random measure $\rho_\beta$, when for each $n$ we have $((x_1,y_1),\dots,(x_n,y_n))$ distributed 
according to $\mu_{n,\beta}$. Then, due to the connection to the Mallows measure, the same result holds when applied to 
$\frac{1}{n}\sum_{k=1}^{n} \delta_{(x_k,y_k)}$ if, for each $n$, we define $x_k = k/n$ and $y_k = \pi_k/n$ for $k = 1,\dots,n$, where we select a 
random permutation $\pi \in S_n$, distributed according to $P_{n,q_n}$, as long as $(q_1,q_2,\dots)$ is a sequence such 
that $\lim_{n\to\infty} n(1 - q_n) = \beta$. Note that $\beta \in \mathbb{R}$ is fixed. (For $\beta < 0$ this means that $q_n$ is slightly greater 
than 1 instead of less than 1, as it would be for $\beta > 0$.) So this result also gives a simpler, more direct 
proof of an old result from [?], which had previously been proved by an obscure method. 

1.1.1 The Mallows model is a frustration free, mean-field model 

The Hamiltonian in (1) is a mean-field Hamiltonian. Mean-field Hamiltonians have the property 
that, when considering a subsystem, the inverse-temperature $\beta$ needs to be rescaled because of the explicit 
dependence of $H_n$ on $n$. The consideration of a sub-system is sometimes known as the cavity method 
for complicated problems, which are most amenable to inductive analysis, removing one particle at 
a time. See for instance [18] as an indication of the physics approach to this method or [28] for the 
mathematical side. Temperature renormalization means that if there is an explicit formula for the 
optimizer of the Mallows model on $[0,1]^2$ in the thermodynamic limit, then restricting attention to the 
sub-squares $\Lambda_{ij}(\theta_1,\theta_2)$, the restriction of the measure to these sets may be an optimizer for different 
choices of $\beta$, due to dilution on the sub-squares. (We will refer to the $\Lambda_{ij}(\theta_1,\theta_2)$’s as “sub-squares” 
even though they are rectangles.) 

Of course, since the model is a mean-field model, there is an interaction between all particles in 
$[0,1]^2$, including between different sub-squares. But then there is a special symmetry of the model. In 
two dimensions, it is common to see conformally invariant models, which is the symmetry one finds 
from local rotational symmetry as well as dilation covariance. For the Mallows model, instead of SO(2) 
symmetry the group which leaves the model invariant is the group of hyperbolic rotations $SO^{+}(1,1)$, 
because one may rescale the two dimensions as long as the area remains fixed, in what is sometimes 
known as a “squeeze transformation.” 

Moreover, one can factorize the degrees of freedom of a measure on a square by first considering 
its $x$ and $y$ marginals, and then considering the measure in the square with those marginals, which we 
call the “coupling measure.” In the Mallows model, the different sub-squares only interact through 
their marginals. We think of the $x$ and $y$ marginals as data living on the boundary of each square, and 
the coupling measure as data in the interior. Then the choice of the coupling measure in the interior of each 
sub-square is not affected by the choice of the boundary marginal measures. So for each sub-square, 
the coupling measure is an unrestricted optimizer of the rate function, but at a diluted value $\beta_{ij}$ of 
the inverse-temperature. In this sense, the problem is “frustration free.” Moreover, this shifts the 
problem to determining the optimal choices of $\beta_{11},\beta_{12},\beta_{21},\beta_{22}$, which are in turn explicit functions of 
$t_{11},t_{12},t_{21},t_{22}$ due to the explicit formula for the pressure $p(\beta) = \lim_{n\to\infty} n^{-1} \ln(Z_n(\beta))$. So, in this 
sense the data $t^*_{ij}(\theta_1,\theta_2)$ may be deduced just from the general formula for the pressure $p : \mathbb{R} \to \mathbb{R}$. 

1.1.2 Discrete symmetry and integrability 

The main point of this paper is to give a simple proof of the uniqueness of the optimizer of $I_\beta$, exploiting 
the symmetry just described in the continuum limit. But there is a related symmetry of the Mallows 
measure even for finite $\beta$. For example, it is related to the Fisher-Yates-Knuth algorithm for perfectly 
simulating permutations. This is also related to the powerful bounds and approximations of Bhatnagar 
and Peled [7]. The symmetry may also be deduced from the “height-function” approach of Wilson 
[32], which was exploited by [5], relating the biased card-shuffling algorithm of Diaconis and Ram [9] 
to the Markov chain projection, which is the asymmetric exclusion process (ASEP). (Of course, in the 
present paper, we only consider invariant measures, not the actual stochastic dynamics, which is at 
least one level higher.) 
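
To make the sequential structure concrete, here is a minimal sketch of a standard insertion sampler for the Mallows measure, in the spirit of the Fisher-Yates-Knuth shuffle mentioned above. It is our own illustration, not an algorithm quoted from the references: the function names `sample_mallows` and `insertion_law` are hypothetical, and the convention (insert the values 1, 2, ..., n one at a time, where inserting value k with m smaller values to its right costs a factor q^m) is an assumption chosen so that the product of the insertion weights equals q raised to the inversion number.

```python
import random
from itertools import permutations


def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def sample_mallows(n, q, rng):
    """Draw one permutation of 1..n with probability proportional to q**inv."""
    perm = []
    for k in range(1, n + 1):
        # Inserting k at index j leaves k - 1 - j smaller values to its right,
        # each of which forms exactly one new inversion with k.
        weights = [q ** (k - 1 - j) for j in range(k)]
        j = rng.choices(range(k), weights=weights)[0]
        perm.insert(j, k)
    return tuple(perm)


def insertion_law(n, q):
    """Exact law of the sampler, obtained by enumerating every insertion choice."""
    law = {(): 1.0}
    for k in range(1, n + 1):
        z_k = sum(q ** m for m in range(k))  # the q-integer [k]_q
        new_law = {}
        for perm, prob in law.items():
            for j in range(k):
                child = perm[:j] + (k,) + perm[j:]
                new_law[child] = new_law.get(child, 0.0) + prob * q ** (k - 1 - j) / z_k
        law = new_law
    return law
```

Comparing `insertion_law(n, q)` with the brute-force weights `q ** inversions(perm)` for small n is a quick way to confirm that the sequential construction reproduces the Mallows measure.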

The ASEP is also unitarily equivalent to the anisotropic Heisenberg model, known as the XXZ 
model, with the anisotropy parameter $\Delta = (q + q^{-1})/2$ with “kink soliton” boundary conditions. For 
more information on the XXZ model, see [?]. This relation has a long history. See, for example, 
[8]. Also, for the relation between the Mallows model and the ASEP, an important reference point is 
to consider $q = 1$ where one sees the relation between the uniform measure on permutations and the 
SEP, where Liggett’s stirring process gives a graphical representation, which one may see in Chapter 
VIII of [16]. (This also leads to duality, and in this regard, one may also refer for the $q \neq 1$ case to 
[?].) 

The two most important points for us are these. First, the XXZ model is also frustration free [13]. This is 
important, because that implies certain properties such as a spectral gap [22] and certain correlation 
structure in the ground state [23]. Secondly, there are certain equations related to the thermodynamic 
limits of the XXZ model such as the Liouville PDE [25]. The Liouville PDE is known to have the 
boundary symmetry we mentioned in the last subsection, related to the frustration free property [15]. 
(Also, see Section 1.1 of the published version of Tao’s blog, year 3, [29], for an elementary derivation 
of the symmetries and solution of the Liouville equation.) In this paper we avoid the partial differential 
equations. But that is an alternative route which has been explored before [27]. 

We do not try to relate the frustration free property of the XXZ model to the frustration free 
property of the LDP optimization problem. But we will state, in an appendix, the discrete version of 
the frustration free property of the LDP. 

1.1.3 Outline for the rest of the paper 

A brief summary of our paper is this. It is known that the “pressure” for the Mallows model is 
explicitly calculable. More precisely, this is related to the $q$-Stirling formula, which is also related 
to the dilogarithm (although we will not discuss that). Because of the symmetry we can reduce the 
4-square problem to the calculation of the pressure at diluted inverse-temperatures. This dilution, or 
“temperature renormalization,” arises in all mean-field problems (where the statistical mechanics setup 
is deficient because the Hamiltonian itself explicitly depends on the system size). The important point 
is that, due to the frustration free property, we may reduce the four-square problem of calculating 
$I_\beta(W_{\theta_1,\theta_2}((t_{ij})_{i,j=1}^{2}))$ to a 1-dimensional optimization problem related to the density of points in each 
sub-square, not the sub-permutations. Even though the rate function of the weak LDP is not strictly convex, this particular 
problem is. That explains why there is a unique minimizer in this problem, despite the lack of convexity. 

Although the problem is related to interesting topics in quantum statistical mechanics, the main 
results and tools, henceforth, will be purely probabilistic. We do not make any further reference to 
quantum spin systems (such as the XXZ model) or partial differential equations (such as the Liouville 
PDE). The approach may be viewed as purely probabilistic by probabilists. 


2 Set-up 

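Let $\lambda$ be the standard Lebesgue measure on $[0,1]$. Let $\lambda^{\otimes 2}$ denote the standard Lebesgue measure on 
$[0,1]^2$, which is a Borel probability measure on $\mathbb{R}^2$. 

For any $n \in \{2,3,\dots\}$ and $\beta \in \mathbb{R}$, let us define the measure $\mu_{n,\beta} \in \mathcal{M}_{+,1}(([0,1]^2)^n)$ to be the 
absolutely continuous measure with respect to $(\lambda^{\otimes 2})^{\otimes n}$ such that 

$$\frac{d\mu_{n,\beta}}{d(\lambda^{\otimes 2})^{\otimes n}}((x_1,y_1),\dots,(x_n,y_n)) = \frac{1}{Z_n(\beta)} \exp\left[-\beta H_n((x_1,y_1),\dots,(x_n,y_n))\right],$$

where 

$$H_n((x_1,y_1),\dots,(x_n,y_n)) = \frac{1}{n-1} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} h((x_i,y_i),(x_j,y_j)), \qquad (2)$$

for 

$$h((x_i,y_i),(x_j,y_j)) = \mathbf{1}_{(-\infty,0)}((x_i - x_j)(y_i - y_j)),$$

and where $Z_n(\beta)$ is a normalization constant 

$$Z_n(\beta) = \int_{([0,1]^2)^n} \exp\left[-\beta H_n((x_1,y_1),\dots,(x_n,y_n))\right] \prod_{j=1}^{n} d\lambda^{\otimes 2}(x_j,y_j).$$

The main result of this paper relates to the weak large deviation principle for this sequence of measures. 
Let us define the finite-volume approximation to the pressure 

$$p_n(\beta) = \frac{1}{n} \ln(Z_n(\beta)).$$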

2.1 Relation to the Mallows measure on permutations 

Given a parameter $q \in (0,\infty)$, the Mallows measure on permutations is a probability measure on the 
symmetric group. More precisely, the probability mass function is $P_{n,q} : S_n \to \mathbb{R}$, where, for a given 
permutation $\pi = (\pi_1,\dots,\pi_n) \in S_n$, we define the inversion number and $P_{n,q}$ as 

$$\mathrm{inv}(\pi) = \#\{(i,j) : i < j \text{ and } \pi_i > \pi_j\} \quad \text{and} \quad P_{n,q}(\pi) = \frac{q^{\mathrm{inv}(\pi)}}{Z_{n,q}}, \quad \text{where} \quad Z_{n,q} = \sum_{\pi \in S_n} q^{\mathrm{inv}(\pi)}. \qquad (3)$$

Diaconis and Ram showed that the measure $P_{n,q}$ is related to the Iwahori-Hecke algebra [9]. One 
related fact is the special formula for the normalization: 

$$Z_{n,q} = \prod_{k=1}^{n} \frac{1 - q^k}{1 - q}. \qquad (4)$$

The $q$-integers are defined as $[k]_q = (1 - q^k)/(1 - q)$ for each $k \in \mathbb{N} = \{1,2,\dots\}$. The $q$-factorial 
function is 

$$[n]_q! = \prod_{k=1}^{n} [k]_q.$$

Hence $Z_{n,q} = [n]_q!$. 
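
As a sanity check, the product formula for the normalization can be compared with a brute-force sum over all permutations for small $n$. The following Python sketch is ours and is not part of the original argument; the function names are hypothetical, and the $q$-integer is computed as the finite geometric sum $1 + q + \dots + q^{k-1}$, which agrees with $(1-q^k)/(1-q)$ for $q \neq 1$ and also covers $q = 1$.

```python
from itertools import permutations


def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def partition_function_bruteforce(n, q):
    """Z_{n,q} as the sum of q**inv(pi) over all pi in S_n."""
    return sum(q ** inversions(p) for p in permutations(range(1, n + 1)))


def q_factorial(n, q):
    """[n]_q! as the product of the q-integers [k]_q = 1 + q + ... + q**(k-1)."""
    result = 1.0
    for k in range(1, n + 1):
        result *= sum(q ** m for m in range(k))
    return result


def mallows_pmf(n, q):
    """Probability mass function P_{n,q} as a dict from permutations to probabilities."""
    z = q_factorial(n, q)
    return {p: q ** inversions(p) / z for p in permutations(range(1, n + 1))}
```

For example, `partition_function_bruteforce(5, 0.7)` and `q_factorial(5, 0.7)` agree up to floating-point error, and the values of `mallows_pmf(4, 1.3)` sum to 1.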

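The relation between the measure $\mu_{n,\beta}$ in $\mathcal{M}_{+,1}(([0,1]^2)^n)$ and the probability mass function $P_{n,q} : 
S_n \to [0,1]$ is elucidated in the following lemma. 

Lemma 2.1 For each $n \in \mathbb{N}$ and each $\beta \in \mathbb{R}$, 

$$p_n(\beta) = \frac{1}{n} \ln\left(\frac{[n]_q!}{n!}\right) \Bigg|_{q = \exp(-\beta/(n-1))}.$$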

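Proof: Since the Lebesgue measure is permutation invariant, we may symmetrize to calculate 

$$e^{n p_n(\beta)} = \frac{1}{n!} \int_{([0,1]^2)^n} \sum_{\pi \in S_n} \exp\left[-\beta H_n((x_{\pi_1},y_1),\dots,(x_{\pi_n},y_n))\right] \prod_{j=1}^{n} d\lambda(x_j) \prod_{j=1}^{n} d\lambda(y_j).$$

For Lebesgue-a.e. choice of $(x_1,y_1),\dots,(x_n,y_n)$, we have 

$$\sum_{\pi \in S_n} \exp\left[-\beta H_n((x_{\pi_1},y_1),\dots,(x_{\pi_n},y_n))\right] = \sum_{\pi \in S_n} \left(e^{-\beta/(n-1)}\right)^{\#\{(i,j) \,:\, i<j \text{ and } (x_{\pi_i} - x_{\pi_j})(y_i - y_j) < 0\}}.$$

But this quantity is precisely $Z_{n,q}$ for $q = \exp(-\beta/(n-1))$. So the result follows from (4). □ 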


This proof demonstrates the relation between $\mu_{n,\beta}$ and $P_{n,q}$: 

= Pu,q{^), a.s., 

where $q = \exp(-\beta/(n-1))$ and $\pi = \pi((x_1,y_1),\dots,(x_n,y_n))$ is the $(\lambda^{\otimes 2})^{\otimes n}$-almost surely unique 
permutation such that 

yTZi ^ yTTj ^ ■ 
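
The correspondence between point configurations and permutations can be illustrated numerically. The sketch below is our own illustration and rests on an assumed convention for the induced permutation: sort the points by their $x$ coordinate and read off the ranks of the $y$ coordinates (all coordinates are assumed distinct, which holds almost surely). Under that convention, the number of discordant pairs, which is $(n-1)H_n$ in the notation of (2), equals the inversion number of the induced permutation. The function names are hypothetical.

```python
import random


def discordant_pairs(points):
    """Count pairs i < j with (x_i - x_j) * (y_i - y_j) < 0, i.e. (n - 1) * H_n."""
    n = len(points)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if (points[i][0] - points[j][0]) * (points[i][1] - points[j][1]) < 0
    )


def induced_permutation(points):
    """Sort by x, then report the rank (1..n) of each y; assumes distinct coordinates."""
    by_x = sorted(points)
    rank = {y: r + 1 for r, y in enumerate(sorted(p[1] for p in points))}
    return tuple(rank[p[1]] for p in by_x)


def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
```

For the three points (0.1, 0.9), (0.5, 0.2), (0.8, 0.4) the induced permutation is (3, 1, 2), which has two inversions, matching the two discordant pairs.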


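Corollary 2.2 For each $\beta \in \mathbb{R}$, 

$$p(\beta) = \int_0^1 \ln\left(\frac{1 - e^{-\beta x}}{\beta x}\right) dx. \qquad (5)$$

Proof: A more general and precise result is true, which we will mention immediately after this proof. 
This easy result follows from 

$$\frac{1}{n} \ln\left(\frac{[n]_q!}{n!}\right) \Bigg|_{q = \exp(-\beta/n)} = -\ln\left(\frac{1 - e^{-\beta/n}}{\beta/n}\right) + \sum_{k=1}^{n} \ln\left(\frac{1 - e^{-\beta x_{n,k}}}{\beta x_{n,k}}\right) \Delta x_n,$$

where 

$$x_{n,k} = \frac{k}{n} \quad \text{and} \quad \Delta x_n = \frac{1}{n}.$$

The first term is easily seen to converge to 0. The second term converges to the Riemann-Stieltjes 
integral. Note that we rescaled $\beta$ by $(n-1)/n$ in this formula, but the limiting formula (involving the 
integral) is continuous in $\beta$. So this rescaling does not matter in the limit. □ 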
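
The convergence in this proof is easy to observe numerically. The following sketch (ours; the function names, the midpoint quadrature rule, and the small-argument Taylor cutoff are all illustrative assumptions) compares the finite-volume pressure, computed from the $q$-factorial with $q = \exp(-\beta/(n-1))$, against a numerical evaluation of the integral formula for the limiting pressure.

```python
import math


def log_q_integer(k, q):
    """ln [k]_q, with the q = 1 case handled separately."""
    if q == 1.0:
        return math.log(k)
    return math.log((1.0 - q ** k) / (1.0 - q))


def finite_pressure(n, beta):
    """p_n(beta) = (1/n) * ln([n]_q! / n!) at q = exp(-beta / (n - 1))."""
    q = math.exp(-beta / (n - 1))
    return sum(log_q_integer(k, q) - math.log(k) for k in range(1, n + 1)) / n


def integrand(beta, x):
    """ln((1 - exp(-beta x)) / (beta x)), using a Taylor expansion near 0."""
    u = beta * x
    if abs(u) < 1e-8:
        return -u / 2.0
    return math.log(-math.expm1(-u) / u)


def limiting_pressure(beta, m=20000):
    """Midpoint-rule approximation of the integral over [0, 1]."""
    return sum(integrand(beta, (k + 0.5) / m) for k in range(m)) / m
```

For instance, `finite_pressure(2000, 2.0)` and `limiting_pressure(2.0)` should differ only by an amount that shrinks as the first argument grows, and for small `beta` the limiting pressure is close to `-beta / 4`.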


A more general and precise result than this one is true. It is called the $q$-Stirling formula. It was 
first proved by Moak [?]. We will discuss this further in the Outlook, Section 8, since it is related 
to the quantitative version of our main result, which may also be useful for studying fluctuations, 
especially in the singular scaling of Bhatnagar and Peled from their paper [7]. 
3 Statement of main results 


For two measures $\mu,\nu$ on a Borel probability space on a compact metric space $X$, let us define the 
relative entropy in the usual way 

$$S(\mu|\nu) = \begin{cases} -\infty & \text{if } \mu \not\ll \nu, \\ -\int_X \ln\left(\frac{d\mu}{d\nu}\right) d\mu & \text{if } \mu \ll \nu. \end{cases}$$


Then the following is an important consequence of general principles. 

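Proposition 3.1 Suppose $\beta \in \mathbb{R}$ is any fixed number. For any Borel subset $A \subset \mathcal{M}_{+,1}([0,1]^2)$, 

$$\limsup_{n\to\infty} \frac{1}{n} \ln \mu_{n,\beta}\left(\left\{((x_1,y_1),\dots,(x_n,y_n)) \in ([0,1]^2)^n : \frac{1}{n}\sum_{k=1}^{n} \delta_{(x_k,y_k)} \in A\right\}\right) \leq \sup_{\nu \in \overline{A}} \left(S(\nu|\lambda^{\otimes 2}) - \beta\mathcal{E}(\nu) - p(\beta)\right),$$

and 

$$\liminf_{n\to\infty} \frac{1}{n} \ln \mu_{n,\beta}\left(\left\{((x_1,y_1),\dots,(x_n,y_n)) \in ([0,1]^2)^n : \frac{1}{n}\sum_{k=1}^{n} \delta_{(x_k,y_k)} \in A\right\}\right) \geq \sup_{\nu \in A^{\circ}} \left(S(\nu|\lambda^{\otimes 2}) - \beta\mathcal{E}(\nu) - p(\beta)\right),$$

where $\overline{A}$ and $A^{\circ}$ denote the closure and interior of $A$, and $\mathcal{E}(\nu)$ is the expectation of the energy in a product state $\nu \otimes \nu$: 

$$\mathcal{E}(\nu) = \frac{1}{2} \int_{[0,1]^2} \int_{[0,1]^2} h((x_1,y_1),(x_2,y_2)) \, d\nu(x_1,y_1) \, d\nu(x_2,y_2),$$

where $h$ is as defined before in (1). 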

Proof: This is a special case of a more general theorem proved by Mukherjee in [21]. We will discuss 
this more, momentarily. Mukherjee also noted that the result had previously been proved by Trashorras 
[30]. The key for both of them was to rephrase permutations in terms of empirical measures. For us, 
we are merely focusing on the measures: i.e., the particles in $[0,1]^2$. We did mention the relation to the 
Mallows measure on $S_n$ in the last section. We will also return to this issue in the outlook, Section 8. 
But for now, we merely focus on the measures, not the permutations. 

But in this context, one may also refer to a beautiful monograph [10]. Ellis considered large 
deviation principles for mean-field statistical mechanical models. Then this proposition follows exactly 
from Theorems II.7.1 and II.7.2 on pages 51-52 of Ellis’s monograph. This is the reference that we are 
most familiar with. □ 


The large deviation rate function is 

$$I_\beta(\nu) = -\left(S(\nu|\lambda^{\otimes 2}) - \beta\mathcal{E}(\nu) - p(\beta)\right). \qquad (6)$$

Part of Mukherjee’s, Trashorras’s and Ellis’s proof of this proposition entails the fact that $I_\beta(\nu)$ 
has infimum equal to 0, as is needed on basic probabilistic grounds. The large deviation rate function 
$I_\beta$ is lower semi-continuous. Therefore, any infimizing sequence possesses a limit point which is a 
minimizer. In particular, there is at least one minimizer. Sometimes we will denote this existential 
minimizer as $\nu^*_\beta \in \mathcal{M}_{+,1}([0,1]^2)$. 

Our main result is uniqueness of the minimizer. We state this in a sequence of steps. The main 
step involves the “four-square problem,” which we describe, shortly. First we would like to comment, 
briefly, on the papers of Mukherjee and Trashorras, which are critical for our own short article. 
3.1 Short discussion 

Let us quickly comment on one important aspect of Mukherjee’s paper [21], first. He not only calculated 
the large deviation principle for the Mallows model, but also for many other non-uniform measures on 
$S_n$, of which the Mallows measure is just one. Mallows, himself, considered various measures on $S_n$, 
and what we are calling the “Mallows measure,” here, is actually just the Mallows measure relative to 
a particular distance function, which is the minimum number of nearest neighbor transpositions (i.e., 
transpositions of the form $(i,i+1)$ for some $i \in \{1,\dots,n-1\}$, sometimes called Coxeter generators) 
needed to transform a given permutation into the identity permutation. Mukherjee has explained all 
of this. Trashorras has also considered many generalizations of the uniform measure on permutations. 
In this sense, they are the same. But Mukherjee also considered properties of the large deviation 
problems. 

We would also like to mention that Trashorras was motivated by models of permutations arising 
from Adams, Bru, Dorlas and König relating to Bose condensation [?]. Another important 
reference is by Betz and Ueltschi [6]. Following a methodology of Ueltschi, they are also related to 
quantum spin systems [12]. But we have not found a direct connection to the Mallows model, yet, since 
the cycle type is more important than the inversion number for all of those applications. Nevertheless, 
there is a direct relation between the Mallows measure and the XXZ model that we mentioned in the 
introduction. We will return to this in the appendix. 

There is a particular type of statistic, which one may call “linear statistics,” for which Mukherjee 
proved uniqueness of optimizers of the LDP. He considered measures 
$Q_{n,f,\theta}(\pi) = \exp\left[\theta \sum_{i=1}^{n} f(i/n,\pi_i/n) - \ln(Z_n(f,\theta))\right]$ for a given continuous function $f : [0,1]^2 \to \mathbb{R}$. In this case, 
the large deviation rate function is equal to the Kullback-Leibler divergence (which is the negative of 
the relative entropy, relative to the uniform measure) plus a linear functional of the measure. But 
the Kullback-Leibler divergence is strictly convex, and addition of a bounded linear functional does 
not affect this. One might expect that if there is uniqueness of the optimizer of the LDP for the 
Mallows measure that this follows from convexity. But, the rate functional $I_\beta$ is not strictly convex 
on $\mathcal{M}_{+,1}([0,1]^2)$. (Note that if we describe everything other than the Kullback-Leibler divergence as an 
effective Hamiltonian, then this is quadratic in the measure, somewhat like the logarithmic potential 
although less singular, not linear. Hence it need not preserve convexity.) We will give an example 
calculation to show this in the appendix. Nevertheless, there is a class of events that does lead to 
convexity for a 1-parameter family of sub-problems. That is how we proceed. This is the 4-square 
problem, which we now describe. 

3.2 Four square problem 

For any $\theta \in (0,1)$, define subsets of $[0,1]$ as 

$$L_1(\theta) = [0,\theta], \quad L_2(\theta) = (\theta,1].$$

For each point $(\theta_1,\theta_2) \in (0,1)^2$, we define four rectangular subsets of $[0,1]^2$: 

$$\Lambda_{ij}(\theta_1,\theta_2) = L_i(\theta_1) \times L_j(\theta_2), \quad \text{for } i,j \in \{1,2\}.$$

Let $\Sigma_4 = \{(t_{11},t_{12},t_{21},t_{22}) \in [0,1]^4 : t_{11}+t_{12}+t_{21}+t_{22} = 1\}$. Given a point in $\Sigma_4$, we define a Borel 
subset of $\mathcal{M}_{+,1}([0,1]^2)$, as 

$$W_{\theta_1,\theta_2}(t_{11},t_{12},t_{21},t_{22}) = \{\nu \in \mathcal{M}_{+,1}([0,1]^2) : \forall i,j \in \{1,2\},\ \nu(\Lambda_{ij}(\theta_1,\theta_2)) = t_{ij}\}.$$

What we call the “four-square problem” is to calculate the large deviation rate of this set. Solving this 
problem for all possible values $(\theta_1,\theta_2) \in (0,1)^2$ and all $(t_{11},t_{12},t_{21},t_{22}) \in \Sigma_4$ will lead to all the minimizers 
of $I_\beta$. 
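Theorem 3.1 For each $\beta \in \mathbb{R}$, each $(\theta_1,\theta_2) \in (0,1)^2$ and each $(t_{11},t_{12},t_{21},t_{22}) \in \Sigma_4$, 

$$\min\{I_\beta(\nu) : \nu \in W_{\theta_1,\theta_2}(t_{11},t_{12},t_{21},t_{22})\} = \Phi_\beta(\theta_1,\theta_2;t_{11},t_{12},t_{21},t_{22}),$$

where 

$$\Phi_\beta(\theta_1,\theta_2;t_{11},t_{12},t_{21},t_{22}) := p(\beta) + \sum_{i,j=1}^{2} t_{ij} \ln\left(\frac{t_{ij}}{\lambda^{\otimes 2}(\Lambda_{ij}(\theta_1,\theta_2))}\right) + \sum_{i,j=1}^{2} t_{ij}\, p(\beta t_{ij})$$
$$\qquad - (t_{11}+t_{12})\, p(\beta(t_{11}+t_{12})) - (t_{11}+t_{21})\, p(\beta(t_{11}+t_{21}))$$
$$\qquad - (t_{12}+t_{22})\, p(\beta(t_{12}+t_{22})) - (t_{21}+t_{22})\, p(\beta(t_{21}+t_{22}))$$
$$\qquad + \beta t_{12} t_{21}.$$

Actually a subset of the four-square problems suffices to find the minimizers of $I_\beta$. 

Given a measure $\nu \in \mathcal{M}_{+,1}([0,1]^2)$, let us define the $x$ and $y$ marginals $\nu_X,\nu_Y \in \mathcal{M}_{+,1}([0,1])$ 
as 

$$\nu_X(\cdot) = \nu(\cdot \times [0,1]) \quad \text{and} \quad \nu_Y(\cdot) = \nu([0,1] \times \cdot). \qquad (8)$$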

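Theorem 3.2 For each $\beta \in \mathbb{R}$, any $\nu \in \mathcal{M}_{+,1}([0,1]^2)$ satisfying $I_\beta(\nu) = 0$ has $\nu_X = \nu_Y = \lambda$. 

Because of this result, we may restrict to $(t_{11},t_{12},t_{21},t_{22}) \in \Sigma_4$ such that $t_{11}+t_{12} = \theta_1$ and $t_{11}+t_{21} = \theta_2$, 
because those are the $\lambda$ measures of $[0,\theta_1]$ and $[0,\theta_2]$. This reduces the parameter from general 
$(t_{11},t_{12},t_{21},t_{22}) \in \Sigma_4$ to just $t_{11}$ in a certain interval. We define 

$$\mathcal{T}_{\theta_1,\theta_2} = [\max\{0,\theta_1+\theta_2-1\}, \min\{\theta_1,\theta_2\}].$$

Then we define $\phi_\beta(\theta_1,\theta_2;\cdot) : \mathcal{T}_{\theta_1,\theta_2} \to \mathbb{R}$ as 

$$\phi_\beta(\theta_1,\theta_2;t) = \Phi_\beta(\theta_1,\theta_2;t,\theta_1-t,\theta_2-t,1-\theta_1-\theta_2+t).$$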


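Theorem 3.3 For each $\beta \in \mathbb{R}$, and each $(\theta_1,\theta_2) \in (0,1)^2$, the function $\phi_\beta(\theta_1,\theta_2;\cdot) : \mathcal{T}_{\theta_1,\theta_2} \to \mathbb{R}$ is 
strictly convex, and the unique critical point is given by 

$$t = R_\beta(\theta_1,\theta_2) := -\frac{1}{\beta} \ln\left(1 - \frac{(1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2})}{1 - e^{-\beta}}\right). \qquad (9)$$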
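
The closed formula (9) lends itself to quick numerical checks. The sketch below is our own illustration with hypothetical function names. It evaluates $R_\beta$, treating $\beta = 0$ by the limiting value $\theta_1\theta_2$ (an assumption consistent with the uniform measure at $q = 1$), and it evaluates a stationarity residual. That residual comes from our own short calculation, not from the text: assuming the pressure is given by (5), setting the $t$-derivative of the one-parameter function to zero reduces to $(1-e^{-\beta t_{11}})(1-e^{-\beta t_{22}}) = (e^{\beta t_{12}}-1)(e^{\beta t_{21}}-1)$ with $t_{11} = t$, $t_{12} = \theta_1 - t$, $t_{21} = \theta_2 - t$, $t_{22} = 1-\theta_1-\theta_2+t$.

```python
import math


def R(beta, theta1, theta2):
    """Candidate joint distribution function from formula (9)."""
    if beta == 0.0:
        return theta1 * theta2  # assumed limiting value as beta -> 0
    a = -math.expm1(-beta * theta1)  # 1 - exp(-beta * theta1)
    b = -math.expm1(-beta * theta2)  # 1 - exp(-beta * theta2)
    c = -math.expm1(-beta)           # 1 - exp(-beta)
    return -math.log1p(-a * b / c) / beta


def stationarity_residual(beta, theta1, theta2, t):
    """Left side minus right side of the critical-point equation described above."""
    t11, t12, t21, t22 = t, theta1 - t, theta2 - t, 1.0 - theta1 - theta2 + t
    lhs = math.expm1(-beta * t11) * math.expm1(-beta * t22)
    rhs = math.expm1(beta * t12) * math.expm1(beta * t21)
    return lhs - rhs
```

Two properties are worth checking: `R(beta, theta, 1.0)` returns `theta`, so the candidate distribution function has uniform marginals, and `stationarity_residual(beta, theta1, theta2, R(beta, theta1, theta2))` vanishes up to rounding.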


This theorem easily leads to the following, which is the main summary of the results. 

Theorem 3.4 The unique measure $\nu \in \mathcal{M}_{+,1}([0,1]^2)$ such that $I_\beta(\nu) = 0$ is $\nu = \nu^*_\beta$, where $d\nu^*_\beta(x,y) = 
\rho^*_\beta(x,y)\, dx\, dy$ for $(x,y) \in [0,1]^2$, where 

$$\rho^*_\beta(x,y) = \frac{\partial^2}{\partial x\, \partial y} R_\beta(x,y).$$

Proof: Any measure $\nu \in \mathcal{M}_{+,1}([0,1]^2)$ such that $I_\beta(\nu) = 0$ must satisfy $\nu_X = \nu_Y = \lambda$ by Theorem 
3.2. Then, for any $(\theta_1,\theta_2) \in (0,1)^2$ the measure $\nu$ is in $W_{\theta_1,\theta_2}(t,\theta_1-t,\theta_2-t,1-\theta_1-\theta_2+t)$ 
for $t = \nu(\Lambda_{11}(\theta_1,\theta_2))$. So by Theorem 3.1, $\Phi_\beta(\theta_1,\theta_2;t,\theta_1-t,\theta_2-t,1-\theta_1-\theta_2+t) = 0$. But then, by 
Theorem 3.3 we see that this means that $t = R_\beta(\theta_1,\theta_2)$. In other words, $\nu([0,\theta_1] \times [0,\theta_2]) = R_\beta(\theta_1,\theta_2)$. 
These rectangular measures completely characterize $\nu$: in fact they give the standard formula for the 
multidimensional distribution function. So uniqueness is proved. We now call it $\nu^*_\beta$. Note that 
$\nu^*_\beta \ll \lambda^{\otimes 2}$ because the relative entropy is not $-\infty$. Therefore, there is a density, which may be 
calculated by differentiating the distribution function. □ 





Let us comment on the relation to the absence of phase transitions for this model. 

For a mean-field model from statistical mechanics, one needs to replace the usual Dobrushin-
Lanford-Ruelle definition of equilibrium states by an appropriate analogue. When a weak large 
deviation principle exists, as it does here, the correct analogue, in the sense of the Boltzmann-Gibbs 
variational principle, may be viewed as $I_\beta(\nu) = 0$. Therefore, this result is a version of absence of 
phase transition in this model, in the sense of uniqueness of equilibrium states. 


4 Standardizing the measure 

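This is a solvable model. This is manifest in the symmetry of $I_\beta(\cdot)$. We describe this, now. 

Definition 4.1 Suppose that $\nu^{(0)} \in \mathcal{M}_{+,1}([0,1]^2)$ satisfies $\nu^{(0)} \ll \lambda^{\otimes 2}$. Suppose that $G_X,G_Y : [0,1] \to 
[0,1]$ are two absolutely continuous probability distribution functions. Then we may define a new 
measure $\nu = \mathcal{R}(\nu^{(0)},G_X,G_Y)$ as 

$$\nu([0,x] \times [0,y]) = \nu^{(0)}([0,G_X(x)] \times [0,G_Y(y)]). \qquad (10)$$

Proposition 4.2 Suppose that $\nu^{(0)} \in \mathcal{M}_{+,1}([0,1]^2)$ satisfies $\nu^{(0)} \ll \lambda^{\otimes 2}$ and $\nu^{(0)}_X = \nu^{(0)}_Y = \lambda$. Suppose 
that $G_X,G_Y : [0,1] \to [0,1]$ are two absolutely continuous probability distribution functions. Defining 
$\nu = \mathcal{R}(\nu^{(0)},G_X,G_Y)$, we have that $\nu_X([0,a]) = G_X(a)$, $\nu_Y([0,a]) = G_Y(a)$ for all $a \in [0,1]$. Moreover, 

$$S(\nu \,|\, \lambda^{\otimes 2}) = S(\nu^{(0)} \,|\, \lambda^{\otimes 2}) + S(\nu_X \,|\, \lambda^{\otimes 1}) + S(\nu_Y \,|\, \lambda^{\otimes 1}), \qquad (11)$$

and 

$$\mathcal{E}(\nu) = \mathcal{E}(\nu^{(0)}). \qquad (12)$$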

Proof: The fact about the marginals follows directly from the definition (10) and the definition of the 
marginals (8) and $\nu^{(0)}_X = \nu^{(0)}_Y = \lambda$. 

Let us write $\rho^{(0)} : [0,1]^2 \to [0,\infty)$ for the density associated to $\nu^{(0)}$. Note that $\int_0^1 \rho^{(0)}(x,y)\,dy = 1$ 
for $x$, $\lambda$-a.e., and similarly $\int_0^1 \rho^{(0)}(x,y)\,dx = 1$ for $y$, $\lambda$-a.e., because $\nu^{(0)}_X = \nu^{(0)}_Y = \lambda$. Let us write 
$g_X : [0,1] \to [0,\infty)$ and $g_Y : [0,1] \to [0,\infty)$ for the density functions associated to $G_X$ and $G_Y$. Then, 
from (10) and the chain rule, the density $\rho$ of $\nu$ is 

$$\rho(x,y) = \rho^{(0)}(G_X(x),G_Y(y))\, g_X(x)\, g_Y(y). \qquad (13)$$

Therefore, 

$$S(\nu \,|\, \lambda^{\otimes 2}) = -\int_0^1 \int_0^1 \ln\left(\rho^{(0)}(G_X(x),G_Y(y))\right) \rho^{(0)}(G_X(x),G_Y(y))\, g_X(x)\, g_Y(y)\, dx\, dy$$
$$\qquad - \int_{[0,1]^2} \ln(g_X(x))\, d\nu(x,y) - \int_{[0,1]^2} \ln(g_Y(y))\, d\nu(x,y).$$

In the second integral we integrate over $y$ first and use the fact that the marginal $\nu_X$ has density 
function $g_X$ and in the third integral we integrate over $x$ first and use the fact that $\nu_Y$ has density 
function $g_Y$. In the first integral we make the change of variables $u = G_X(x)$ and $v = G_Y(y)$ (and 
then rename $(u,v)$ as $(x,y)$), and we obtain 

$$S(\nu \,|\, \lambda^{\otimes 2}) = -\int_0^1 \int_0^1 \ln\left(\rho^{(0)}(x,y)\right) \rho^{(0)}(x,y)\, dx\, dy - \int_0^1 \ln(g_X(x))\, g_X(x)\, dx - \int_0^1 \ln(g_Y(y))\, g_Y(y)\, dy.$$

The first integral follows from the chain rule for Stieltjes integrals. See, for example, [26], Chapter III, 
especially Section 58. 

The result for the energy follows from a similar calculation, using the same method. □ 

Now, given any $\nu \in \mathcal{M}_{+,1}([0,1]^2)$, if $\nu \ll \lambda^{\otimes 2}$ then $\nu_X$ and $\nu_Y$, defined in (8), are both absolutely 
continuous with respect to $\lambda$. We define $F_{\nu,X},F_{\nu,Y} : [0,1] \to [0,1]$ as 

$$F_{\nu,X}(a) = \nu_X([0,a]) \quad \text{and} \quad F_{\nu,Y}(a) = \nu_Y([0,a]),$$

for each $a \in [0,1]$. These functions are both continuous. 

Definition 4.3 Given a distribution function $G : [0,1] \to [0,1]$, let us denote the generalized inverse 
as $G^{\dagger} : [0,1] \to [0,1]$, where the condition to be a generalized inverse is 

$$\forall x \in [0,1], \quad G^{\dagger}(x) = \inf(\mathcal{U}_G(x)), \quad \text{where} \quad \mathcal{U}_G(x) = \{a \in [0,1] : G(a) \geq x\}.$$

Generally speaking, by right-continuity of $G$, we have $G(G^{\dagger}(x)) = \inf\{G(a) : a \in \mathcal{U}_G(x)\} \geq x$. Also, 
if $y < G^{\dagger}(x)$ then $y \notin \mathcal{U}_G(x)$ so $G(y) < x$. Hence, 

$$G(G^{\dagger}(x)) \geq x, \quad \text{and} \quad \left(y < G^{\dagger}(x) \Rightarrow G(y) < x\right). \qquad (14)$$

That is true for any distribution function. More is true if $G$ is continuous. 

Lemma 4.4 Suppose that $G : [0,1] \to [0,1]$ is a continuous distribution function. Then $G^{\dagger}$ is a right 
inverse for $G$: for all $x \in [0,1]$, $G(G^{\dagger}(x)) = x$. 

Proof: Suppose that we had $G(G^{\dagger}(x)) > x$. Since $G$ is continuous, there would exist some $y < G^{\dagger}(x)$ 
such that $G(y) > x$, as well. But this contradicts (14). □ 
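
The generalized inverse is easy to experiment with numerically. The sketch below is our own illustration: the bisection routine, its name, and the example distribution function `clamp_cdf` (which has flat pieces) are assumptions for demonstration only. It computes the infimum of the set of $a$ with $G(a) \geq x$ for a continuous, non-decreasing $G$ with $G(1) = 1$, and lets one observe both the right-inverse property of Lemma 4.4 and the fact that composing in the other order can fall strictly below the starting point when $G$ has a flat piece.

```python
def generalized_inverse(G, x, iterations=60):
    """Approximate inf{a in [0, 1] : G(a) >= x} by bisection.

    Assumes G is non-decreasing and continuous on [0, 1] with G(1) >= x.
    """
    if G(0.0) >= x:
        return 0.0
    lo, hi = 0.0, 1.0  # invariant: G(lo) < x <= G(hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if G(mid) >= x:
            hi = mid
        else:
            lo = mid
    return hi


def clamp_cdf(a):
    """Continuous distribution function that is flat on [0, 0.25] and on [0.75, 1]."""
    return min(1.0, max(0.0, 2.0 * a - 0.5))
```

With `clamp_cdf`, the generalized inverse at 1 is 0.75, so applying the distribution function and then its generalized inverse to 0.9 gives 0.75, which is below 0.9, while applying them in the other order always returns the starting value.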

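Definition 4.5 If $\nu \in \mathcal{M}_{+,1}([0,1]^2)$ satisfies $\nu \ll \lambda^{\otimes 2}$, then define $\hat{\nu} \in \mathcal{M}_{+,1}([0,1]^2)$ to be the measure 
such that 

$$\hat{\nu}([0,x] \times [0,y]) = \nu([0,F^{\dagger}_{\nu,X}(x)] \times [0,F^{\dagger}_{\nu,Y}(y)]), \qquad (15)$$

with the notation as above. 

The important relation between $\nu$ and $\hat{\nu}$ is the following. 

Lemma 4.6 If $\nu \in \mathcal{M}_{+,1}([0,1]^2)$ satisfies $\nu \ll \lambda^{\otimes 2}$, then $(\hat{\nu})_X = (\hat{\nu})_Y = \lambda$ and 

$$\nu([0,x] \times [0,y]) = \hat{\nu}([0,F_{\nu,X}(x)] \times [0,F_{\nu,Y}(y)]), \qquad (16)$$

for all $x,y \in [0,1]$. In other words, $\nu = \mathcal{R}(\hat{\nu},F_{\nu,X},F_{\nu,Y})$. 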

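Proof: To simplify notation, let us just write $F_X,F_Y,F^{\dagger}_X,F^{\dagger}_Y$ in place of $F_{\nu,X},F_{\nu,Y},F^{\dagger}_{\nu,X},F^{\dagger}_{\nu,Y}$. 

By definition 

$$(\hat{\nu})_X([0,x]) = \hat{\nu}([0,x] \times [0,1]) = \nu([0,F^{\dagger}_X(x)] \times [0,F^{\dagger}_Y(1)]).$$

Now we note that this means 

$$(\hat{\nu})_X([0,x]) \leq \nu([0,F^{\dagger}_X(x)] \times [0,1]) = \nu_X([0,F^{\dagger}_X(x)]) = F_X(F^{\dagger}_X(x)) = x,$$

using Lemma 4.4. But in fact, 

$$x - (\hat{\nu})_X([0,x]) = \nu([0,F^{\dagger}_X(x)] \times (F^{\dagger}_Y(1),1]) \leq \nu([0,1] \times (F^{\dagger}_Y(1),1]) = \nu_Y((F^{\dagger}_Y(1),1]),$$

and this equals $F_Y(1) - F_Y(F^{\dagger}_Y(1))$, i.e., $1 - F_Y(F^{\dagger}_Y(1))$, which is 0, again by Lemma 4.4. So this 
proves that $(\hat{\nu})_X = \lambda$, and the corresponding fact for the $y$ marginal follows, similarly. 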
Now, applying (15) with $x$ replaced by $F_X(x)$ and $y$ replaced by $F_Y(y)$, 

$$\hat{\nu}([0,F_X(x)] \times [0,F_Y(y)]) = \nu([0,F^{\dagger}_X(F_X(x))] \times [0,F^{\dagger}_Y(F_Y(y))]). \qquad (17)$$

We note that (14) implies that if $a < F^{\dagger}_X(F_X(x))$ then $F_X(a) < F_X(x)$. Since $F_X$ is non-decreasing 
this implies that for every $a \in [0,1]$, if we have $a < F^{\dagger}_X(F_X(x))$ then $a < x$. In other words, this all 
implies that $F^{\dagger}_X(F_X(x)) \leq x$. Using this, we may see that (17) implies that 

$$\hat{\nu}([0,F_X(x)] \times [0,F_Y(y)]) \leq \nu([0,x] \times [0,y]).$$

But then by the same type of argument as before, using Lemma 4.4, we actually have 

$$\nu([0,x] \times [0,y]) - \hat{\nu}([0,F_X(x)] \times [0,F_Y(y)]) \leq \nu_X([0,x]) - \nu_X([0,F^{\dagger}_X(F_X(x))]) + \nu_Y([0,y]) - \nu_Y([0,F^{\dagger}_Y(F_Y(y))]).$$

And this equals 0. □ 

Finally, we state the scaling properties of the entropy and energy, under the transformation of $\nu$ to $\hat{\nu}$. 

Corollary 4.7 For any $\nu \in \mathcal{M}_{+,1}([0,1]^2)$ with $\nu \ll \lambda^{\otimes 2}$, 

$$S(\nu \,|\, \lambda^{\otimes 2}) = S(\hat{\nu} \,|\, \lambda^{\otimes 2}) + S(\nu_X \,|\, \lambda^{\otimes 1}) + S(\nu_Y \,|\, \lambda^{\otimes 1}), \qquad (18)$$

and 

$$\mathcal{E}(\nu) = \mathcal{E}(\hat{\nu}). \qquad (19)$$

Proof: Combine Proposition 4.2 and Lemma 4.6. □ 

We may now prove Theorem 3.2. (We will prove Theorem 3.1 in a later section.) 

Proof of Theorem 3.2: From (18) and (19), it follows that 

$$I_\beta(\nu) = I_\beta(\hat{\nu}) - S(\nu_X|\lambda) - S(\nu_Y|\lambda). \qquad (20)$$

Hence 

$$I_\beta(\hat{\nu}) = I_\beta(\nu) + S(\nu_X|\lambda) + S(\nu_Y|\lambda).$$

But the relative entropy is nonpositive and equals 0 only at the unique maximizer $\lambda$. Therefore, since 
$I_\beta$ can never be negative, we see that $I_\beta(\nu)$ can equal 0 only if $\nu_X = \nu_Y = \lambda$. □ 

We can think of this proof as a partial solution of the one-square problem, for $[0,1]^2$. More 
specifically, while it does not give the unique measure $\nu \in \mathcal{M}_{+,1}([0,1]^2)$ solving $I_\beta(\nu) = 0$, it does give 
the $x$ and $y$ marginals. The next step is to consider the two-square problem. For the two-square 
problem, we will not determine the exact marginals. But we will prove a scaling which will ultimately 
let us solve the four-square problem. 

5 The two-square problem 

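Given any $\theta \in (0,1)$ we define $\Lambda_1(\theta) = [0,1] \times [0,\theta]$ and $\Lambda_2(\theta) = [0,1] \times (\theta,1]$. Since $\nu_Y = \lambda$, we have 
$\nu(\Lambda_1) = \theta$ and $\nu(\Lambda_2) = 1 - \theta$. We define two new measures $\nu^{(1)},\nu^{(2)} \in \mathcal{M}_{+,1}([0,1]^2)$ by 

$$\nu^{(1)}(\cdot) = \theta^{-1}\nu(\cdot \cap \Lambda_1) \quad \text{and} \quad \nu^{(2)}(\cdot) = (1-\theta)^{-1}\nu(\cdot \cap \Lambda_2).$$

Since $\nu_Y = \lambda$, we have for the $y$-marginals of $\nu^{(1)}$ and $\nu^{(2)}$ 

$$\nu^{(1)}_Y(\cdot) = \theta^{-1}\lambda(\cdot \cap [0,\theta]) \quad \text{and} \quad \nu^{(2)}_Y(\cdot) = (1-\theta)^{-1}\lambda(\cdot \cap (\theta,1]). \qquad (21)$$

The $x$ marginals $\nu^{(1)}_X$ and $\nu^{(2)}_X$ may be more complicated. But we can prove the following result. 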
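Lemma 5.1 Assuming that $\nu \in \mathcal{M}_{+,1}([0,1]^2)$ has $\nu \ll \lambda^{\otimes 2}$ and $\nu_Y = \lambda$, using the notation above, 

$$I_\beta(\nu) = p(\beta) - \theta p(\beta\theta) - (1-\theta) p(\beta(1-\theta)) + \theta I_{\beta\theta}(\hat{\nu}^{(1)}) + (1-\theta) I_{\beta(1-\theta)}(\hat{\nu}^{(2)})$$
$$\qquad - \theta S(\nu^{(1)}_X \,|\, \lambda) - (1-\theta) S(\nu^{(2)}_X \,|\, \lambda) + \beta\theta(1-\theta) \int_0^1 \int_0^1 \mathbf{1}_{(-\infty,0)}(x_2 - x_1)\, d\nu^{(1)}_X(x_1)\, d\nu^{(2)}_X(x_2).$$

Proof: A straightforward computation using the definition of the entropy shows that, defining 
$s(\theta,1-\theta) = -\theta\ln(\theta) - (1-\theta)\ln(1-\theta)$, 

$$I_\beta(\nu) = p(\beta) - s(\theta,1-\theta) - \theta S(\nu^{(1)}|\lambda^{\otimes 2}) - (1-\theta) S(\nu^{(2)}|\lambda^{\otimes 2}) + \beta\theta^2 \mathcal{E}(\nu^{(1)}) + \beta(1-\theta)^2 \mathcal{E}(\nu^{(2)})$$
$$\qquad + \beta\theta(1-\theta) \int_{[0,1]^2} \int_{[0,1]^2} h((x_1,y_1),(x_2,y_2))\, d\nu^{(1)}(x_1,y_1)\, d\nu^{(2)}(x_2,y_2).$$

We note that, because $\nu^{(1)}$ has support in $\Lambda_1$ and $\nu^{(2)}$ has support in $\Lambda_2$, we may rewrite this as 

$$I_\beta(\nu) = p(\beta) - s(\theta,1-\theta) - \theta S(\nu^{(1)}|\lambda^{\otimes 2}) - (1-\theta) S(\nu^{(2)}|\lambda^{\otimes 2}) + \beta\theta^2 \mathcal{E}(\nu^{(1)}) + \beta(1-\theta)^2 \mathcal{E}(\nu^{(2)})$$
$$\qquad + \beta\theta(1-\theta) \int_0^1 \int_0^1 \mathbf{1}_{(-\infty,0)}(x_2 - x_1)\, d\nu^{(1)}_X(x_1)\, d\nu^{(2)}_X(x_2).$$

Now we note that using the definition of $I_\beta$ for all $\beta$’s, this may be rewritten as 

$$I_\beta(\nu) = p(\beta) - s(\theta,1-\theta) + \theta I_{\beta\theta}(\nu^{(1)}) - \theta p(\beta\theta) + (1-\theta) I_{\beta(1-\theta)}(\nu^{(2)}) - (1-\theta) p(\beta(1-\theta))$$
$$\qquad + \beta\theta(1-\theta) \int_0^1 \int_0^1 \mathbf{1}_{(-\infty,0)}(x_2 - x_1)\, d\nu^{(1)}_X(x_1)\, d\nu^{(2)}_X(x_2).$$

Finally, using (20), we may rewrite this as 

$$I_\beta(\nu) = p(\beta) - s(\theta,1-\theta) - \theta p(\beta\theta) - (1-\theta) p(\beta(1-\theta)) + \theta I_{\beta\theta}(\hat{\nu}^{(1)}) + (1-\theta) I_{\beta(1-\theta)}(\hat{\nu}^{(2)})$$
$$\qquad - \theta S(\nu^{(1)}_X \,|\, \lambda) - \theta S(\nu^{(1)}_Y \,|\, \lambda) - (1-\theta) S(\nu^{(2)}_X \,|\, \lambda) - (1-\theta) S(\nu^{(2)}_Y \,|\, \lambda)$$
$$\qquad + \beta\theta(1-\theta) \int_0^1 \int_0^1 \mathbf{1}_{(-\infty,0)}(x_2 - x_1)\, d\nu^{(1)}_X(x_1)\, d\nu^{(2)}_X(x_2).$$

Using (21), we may calculate $S(\nu^{(1)}_Y \,|\, \lambda) = \ln(\theta)$ and $S(\nu^{(2)}_Y \,|\, \lambda) = \ln(1-\theta)$, so that these two terms 
cancel $-s(\theta,1-\theta)$. This yields the desired result. □ 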

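Corollary 5.2 For any measures $\mu, \tilde\mu \in \mathcal{M}_{+,1}([0,1])$, both of which are absolutely continuous with
respect to $\lambda$, we have, for each $\beta \in \mathbb{R}$ and each $\theta \in (0,1)$,

$$\begin{aligned}
&-\theta S(\mu | \lambda) - (1-\theta) S(\tilde\mu | \lambda) + \beta\theta(1-\theta) \int_0^1\!\!\int_0^1 \mathbf{1}_{(-\infty,0)}(x_2 - x_1)\, d\mu(x_1)\, d\tilde\mu(x_2) \\
&\qquad \geq \theta p(\beta\theta) + (1-\theta)\,p(\beta(1-\theta)) - p(\beta).
\end{aligned} \tag{22}$$

Moreover, there does exist a pair of measures $\mu, \tilde\mu \in \mathcal{M}_{+,1}([0,1])$ giving equality.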
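As a quick sanity check of inequality (22), one may evaluate both sides numerically in the simplest case $\mu = \tilde\mu = \lambda$, where both entropies vanish and the double integral equals $1/2$. The sketch below is only illustrative. It assumes the integral representation $p(\beta) = \int_0^1 \ln\big((1-e^{-\beta x})/(\beta x)\big)\,dx$, which is chosen to agree with the derivative identity (29); the function names `p` and `gap_uniform` are hypothetical.

```python
import math

def p(beta, steps=20000):
    """Midpoint-rule approximation of the pressure p(beta).

    Assumption: p(beta) = int_0^1 ln((1 - exp(-beta x)) / (beta x)) dx,
    which is consistent with d/dtheta [theta p(beta theta)] in (29).
    """
    if beta == 0.0:
        return 0.0
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        # -expm1(-z) = 1 - exp(-z); the ratio is positive for either sign of beta
        total += math.log(-math.expm1(-beta * x) / (beta * x))
    return total * h

def gap_uniform(beta, theta):
    """Left side minus right side of (22) when both measures are Lebesgue."""
    lhs = beta * theta * (1.0 - theta) / 2.0
    rhs = (theta * p(beta * theta)
           + (1.0 - theta) * p(beta * (1.0 - theta))
           - p(beta))
    return lhs - rhs

if __name__ == "__main__":
    for beta, theta in [(2.0, 0.3), (-3.0, 0.5), (0.1, 0.4)]:
        print(beta, theta, gap_uniform(beta, theta))
```

For small $\beta$ the gap behaves like $\beta^2\theta(1-\theta)/24$, so the uniform pair satisfies (22) strictly; the equality case is attained by other, non-uniform marginals.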

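Proof: Recall that for each $\beta \in \mathbb{R}$, we do know that there exists at least one measure in $\mathcal{M}_{+,1}([0,1]^2)$
which minimizes $\mathcal{I}_\beta$, using soft analysis, especially lower semi-continuity and weak-compactness. For
each $\beta \in \mathbb{R}$, we choose one such measure and call it $\nu^*_\beta$.

Let us define $\kappa, \tilde\kappa \in \mathcal{M}_{+,1}([0,1])$ by $d\kappa(y) = \theta^{-1}\mathbf{1}_{[0,\theta]}(y)\,dy$ and $d\tilde\kappa(y) = (1-\theta)^{-1}\mathbf{1}_{(\theta,1]}(y)\,dy$.
Then we may define two measures $\xi, \tilde\xi \in \mathcal{M}([0,1]^2)$, where $\xi$ is built from $\nu^*_{\beta\theta}$ and the
distribution functions $F_\mu$ and $F_\kappa$ for $\mu$ and $\kappa$, respectively, and with a similar definition for $\tilde\xi$ based on $\nu^*_{\beta(1-\theta)}$,
$\tilde\mu$ and $\tilde\kappa$. Then taking $\nu = \theta\xi + (1-\theta)\tilde\xi$, we will have $\nu^{(1)} = \xi$, $\nu^{(2)} = \tilde\xi$, et cetera. In particular, we
will have $\mathcal{I}_{\beta\theta}(\tilde\nu^{(1)}) = \mathcal{I}_{\beta(1-\theta)}(\tilde\nu^{(2)}) = 0$ because $\mathcal{I}_{\beta\theta}(\nu^*_{\beta\theta}) = 0$ and $\mathcal{I}_{\beta(1-\theta)}(\nu^*_{\beta(1-\theta)}) = 0$. So applying
Lemma 5.1 and using Proposition 3.1 we obtain the inequality.

To prove that there does exist a case of equality, take $\nu \in \mathcal{M}_{+,1}([0,1]^2)$ to be $\nu^*_\beta$, which we know
exists since we do know that a minimizer for $\mathcal{I}_\beta$ does exist. Then taking $\mu = \nu_X^{(1)}$ and $\tilde\mu = \nu_X^{(2)}$, with $\nu^{(1)}$ and $\nu^{(2)}$ as at the
beginning of the section, we obtain equality. □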


6 The four-square problem 

Suppose that $\theta_1, \theta_2 \in (0,1)$. In order to simplify notation, let us denote $\Lambda_{ij}(\theta_1,\theta_2)$ just as $\Lambda_{ij}$, for
this section. We assume that we have $(t_{11}, t_{12}, t_{21}, t_{22}) \in \Sigma_4$ defined. Let us assume, for now, that all
four numbers are positive. We will say later what changes if some $t_{ij}$ equals 0.

We consider $\nu \in \mathcal{W}_{\theta_1,\theta_2}(t_{11}, t_{12}, t_{21}, t_{22})$. We define four measures, $\nu^{(i,j)}$ for $i,j \in \{1,2\}$, by

$$d\nu^{(i,j)}(x,y) = t_{ij}^{-1}\,\mathbf{1}_{\Lambda_{ij}}(x,y)\, d\nu(x,y).$$

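Lemma 6.1 Assuming that $\nu \ll \lambda^{\otimes 2}$ and $\nu_X = \nu_Y = \lambda$, using the notation above,

$$\begin{aligned}
\mathcal{I}_\beta(\nu) &= p(\beta) + t_{11}\ln(t_{11}) + t_{12}\ln(t_{12}) + t_{21}\ln(t_{21}) + t_{22}\ln(t_{22}) \\
&\quad + \sum_{i,j=1}^{2} \left( t_{ij}\,\mathcal{I}_{\beta t_{ij}}(\tilde\nu^{(i,j)}) - t_{ij} S(\nu_X^{(i,j)} | \lambda) - t_{ij} S(\nu_Y^{(i,j)} | \lambda) - t_{ij}\, p(\beta t_{ij}) \right) \\
&\quad + \beta t_{11} t_{12} \int_0^1\!\!\int_0^1 \mathbf{1}_{(-\infty,0)}(x_2 - x_1)\, d\nu_X^{(1,1)}(x_1)\, d\nu_X^{(1,2)}(x_2) \\
&\quad + \beta t_{11} t_{21} \int_0^1\!\!\int_0^1 \mathbf{1}_{(-\infty,0)}(y_2 - y_1)\, d\nu_Y^{(1,1)}(y_1)\, d\nu_Y^{(2,1)}(y_2) \\
&\quad + \beta t_{21} t_{22} \int_0^1\!\!\int_0^1 \mathbf{1}_{(-\infty,0)}(x_2 - x_1)\, d\nu_X^{(2,1)}(x_1)\, d\nu_X^{(2,2)}(x_2) \\
&\quad + \beta t_{12} t_{22} \int_0^1\!\!\int_0^1 \mathbf{1}_{(-\infty,0)}(y_2 - y_1)\, d\nu_Y^{(1,2)}(y_1)\, d\nu_Y^{(2,2)}(y_2) \\
&\quad + \beta t_{12} t_{21}.
\end{aligned} \tag{23}$$

Proof: One goes through the same steps as in the proof of Lemma 5.1. □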


We want to use this to prove Theorem 3.1. The idea of the completion of the proof is to use
Corollary 5.2. But first we generalize it slightly.

Corollary 6.2 Suppose $\nu_X^{(1)}, \nu_X^{(2)} \in \mathcal{M}_{+,1}([0,1])$ both have support inside an interval $[a,b]$ with $b - a =
\theta \in (0,1)$. Then, for each $\beta \in \mathbb{R}$ and each $t_1, t_2 \in (0,1)$,

$$\begin{aligned}
&-t_1 S(\nu_X^{(1)} | \lambda) - t_2 S(\nu_X^{(2)} | \lambda) + \beta t_1 t_2 \int_0^1\!\!\int_0^1 \mathbf{1}_{(-\infty,0)}(x_2 - x_1)\, d\nu_X^{(1)}(x_1)\, d\nu_X^{(2)}(x_2) \\
&\qquad \geq -(t_1 + t_2)\ln(\theta) + t_1\, p(\beta t_1) + t_2\, p(\beta t_2) - (t_1 + t_2)\, p(\beta(t_1 + t_2)).
\end{aligned} \tag{24}$$

Moreover, there does exist a pair of measures $\nu_X^{(1)}, \nu_X^{(2)} \in \mathcal{M}_{+,1}([0,1])$, both having support inside $[a,b]$,
giving equality.

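Proof: This follows from Corollary 5.2 by making some scaling transformations. We may define $\hat\nu_X^{(1)}$
and $\hat\nu_X^{(2)}$ by

$$\hat\nu_X^{(i)}(A) = \nu_X^{(i)}(\{a + (b-a)x : x \in A\}).$$

It is easy to see that $S(\nu_X^{(i)} | \lambda) = S(\hat\nu_X^{(i)} | \lambda) + \ln(\theta)$. With this, the result follows by straightforward
calculations, applying Corollary 5.2 to $\hat\nu_X^{(1)}$ and $\hat\nu_X^{(2)}$ with $\beta(t_1+t_2)$ and $t_1/(t_1+t_2)$ in place of the parameters $\beta$ and $\theta$ there. □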
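The entropy shift under rescaling used in this proof can be checked numerically. The following sketch is only an illustration under stated assumptions: it takes $S(\nu | \lambda) = -\int f \ln f \, dx$ for a density $f$ (consistent with the relative entropy being nonpositive), and it uses the hypothetical example density $g(u) = 2u$ on $[0,1]$, rescaled onto an interval $[a,b]$ of length $\theta$.

```python
import math

def entropy(density, lo, hi, steps=100000):
    """Midpoint-rule value of S = -int density * ln(density) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        f = density(x)
        if f > 0.0:
            total -= f * math.log(f) * h
    return total

# Hypothetical example: interval [a, b] and a density g on [0, 1].
a, b = 0.2, 0.7
theta = b - a

def g(u):
    return 2.0 * u

def f(x):
    # Density of the measure obtained by rescaling g from [0, 1] onto [a, b].
    return g((x - a) / theta) / theta

S_hat = entropy(g, 0.0, 1.0)   # entropy of the measure on [0, 1]
S = entropy(f, a, b)           # entropy of the rescaled measure on [a, b]

if __name__ == "__main__":
    print(S_hat, S, S_hat + math.log(theta))
```

Concentrating the same shape on a shorter interval lowers the entropy by exactly $|\ln\theta|$, which is the source of the $-(t_1+t_2)\ln(\theta)$ term in (24).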


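Now we may prove Theorem 3.1, including the case where some $t_{ij}$ may equal 0.

Proof of Theorem 3.1: At first, assume that all $t_{ij}$'s are strictly positive, as we have been
assuming in this section, up to this point. Combining Lemma 6.1 with Corollary 6.2 and writing $|\Lambda_{ij}|$
for $\lambda^{\otimes 2}(\Lambda_{ij})$, we obtain

$$\begin{aligned}
\mathcal{I}_\beta(\nu) &\geq p(\beta) + \sum_{i,j=1}^{2} t_{ij} \ln\left(\frac{t_{ij}}{|\Lambda_{ij}|}\right) + \sum_{i,j=1}^{2} t_{ij}\,\mathcal{I}_{\beta t_{ij}}(\tilde\nu^{(i,j)}) + \sum_{i,j=1}^{2} t_{ij}\, p(\beta t_{ij}) \\
&\quad - (t_{11} + t_{12})\, p(\beta(t_{11} + t_{12})) - (t_{11} + t_{21})\, p(\beta(t_{11} + t_{21})) \\
&\quad - (t_{12} + t_{22})\, p(\beta(t_{12} + t_{22})) - (t_{21} + t_{22})\, p(\beta(t_{21} + t_{22})) \\
&\quad + \beta t_{12} t_{21}.
\end{aligned} \tag{25}$$

Finally, using Proposition 3.1 we obtain

$$\begin{aligned}
\mathcal{I}_\beta(\nu) &\geq p(\beta) + \sum_{i,j=1}^{2} t_{ij} \ln\left(\frac{t_{ij}}{|\Lambda_{ij}|}\right) + \sum_{i,j=1}^{2} t_{ij}\, p(\beta t_{ij}) \\
&\quad - (t_{11} + t_{12})\, p(\beta(t_{11} + t_{12})) - (t_{11} + t_{21})\, p(\beta(t_{11} + t_{21})) \\
&\quad - (t_{12} + t_{22})\, p(\beta(t_{12} + t_{22})) - (t_{21} + t_{22})\, p(\beta(t_{21} + t_{22})) \\
&\quad + \beta t_{12} t_{21}.
\end{aligned} \tag{26}$$

We use Proposition 3.1 to lower bound $\mathcal{I}_{\beta t_{ij}}(\tilde\nu^{(i,j)})$ by 0 everywhere.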

The cases of equality follow from the cases of equality in Corollary 6.2 as well as the fact that $\nu^*$
measures do exist to give the minimum in $\mathcal{I}$. There is no obstruction to putting these together using
$\tilde\nu$, somewhat similarly to what we did in the proof of Corollary 5.2 but with 4 squares instead of 2. Corollary
6.2 gives marginals, and $\nu^*$ gives values for the $\tilde\nu$'s. Then we may use Definition 4.1 as a prescription for
obtaining $\nu$.

Finally, for the case that some $t_{ij}$ equals 0, all that happens is that the actual value of $\nu^{(i,j)}$ is
irrelevant. In particular note that there is no source of discontinuity of $S$ arising from this. The
Lebesgue measure $|\Lambda_{ij}|$ is fixed and positive because $(\theta_1, \theta_2) \in (0,1)^2$. All that happens is that the
density becomes zero. But $\phi(x) = -x\ln(x)$ is continuous at 0, since it is defined to be $\phi(0) = 0$. □


7 Calculus facts 

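We have now proved Theorem 3.1 and Theorem 3.2. (We proved them in opposite order.) The proof
of Theorem 3.3 now occupies us. It follows from calculus exercises. We will sometimes write $t_{ij}$ for
$T_{ij}(\theta_1, \theta_2; t)$, where

$$T_{11}(\theta_1,\theta_2;t) = t, \quad T_{12}(\theta_1,\theta_2;t) = \theta_1 - t, \quad T_{21}(\theta_1,\theta_2;t) = \theta_2 - t, \quad T_{22}(\theta_1,\theta_2;t) = 1 - \theta_1 - \theta_2 + t.$$

Let us summarize the main results.

Lemma 7.1 (a) For $t \in \mathcal{T}_{\theta_1,\theta_2} = (\max\{0, \theta_1 + \theta_2 - 1\}, \min\{\theta_1, \theta_2\})$, we have

$$\frac{\partial^2}{\partial t^2} \Phi_\beta(\theta_1,\theta_2;t) = \sum_{i,j=1}^{2} \frac{\beta}{2\tanh(\beta t_{ij}/2)}. \tag{27}$$

(b) The critical point equation $\frac{\partial}{\partial t}\Phi_\beta(\theta_1,\theta_2;t) = 0$ is equivalent to

$$\frac{(1 - e^{-\beta t_{11}})(1 - e^{-\beta t_{22}})}{(e^{\beta t_{12}} - 1)(e^{\beta t_{21}} - 1)} = 1.$$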
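Part (a) can be checked numerically by comparing a central second difference of $\Phi_\beta(\theta_1,\theta_2;\cdot)$ with the closed form (27). The sketch below is a hedged illustration, not part of the argument. It assumes the expression (28) for $\Phi_\beta$, the integral form $p(\beta) = \int_0^1 \ln\big((1-e^{-\beta x})/(\beta x)\big)\,dx$ suggested by (29), and the areas $|\Lambda_{11}| = \theta_1\theta_2$, $|\Lambda_{12}| = \theta_1(1-\theta_2)$, $|\Lambda_{21}| = (1-\theta_1)\theta_2$, $|\Lambda_{22}| = (1-\theta_1)(1-\theta_2)$; all function names are hypothetical.

```python
import math

def p(beta, steps=4000):
    """Midpoint rule for the assumed form p(beta) = int_0^1 ln((1-e^{-beta x})/(beta x)) dx."""
    if beta == 0.0:
        return 0.0
    h = 1.0 / steps
    return h * sum(
        math.log(-math.expm1(-beta * (k + 0.5) * h) / (beta * (k + 0.5) * h))
        for k in range(steps)
    )

def cells(th1, th2, t):
    """(t11, t12, t21, t22) as functions of t."""
    return [t, th1 - t, th2 - t, 1.0 - th1 - th2 + t]

def phi(beta, th1, th2, t):
    """Phi_beta(theta1, theta2; t) following the assumed expression (28)."""
    tij = cells(th1, th2, t)
    areas = [th1 * th2, th1 * (1 - th2), (1 - th1) * th2, (1 - th1) * (1 - th2)]
    val = (p(beta) - th1 * p(beta * th1) - th2 * p(beta * th2)
           - (1 - th1) * p(beta * (1 - th1)) - (1 - th2) * p(beta * (1 - th2)))
    for s, area in zip(tij, areas):
        val += s * math.log(s / area) + s * p(beta * s)
    return val + beta * tij[1] * tij[2]

def second_derivative_formula(beta, th1, th2, t):
    """Right-hand side of (27)."""
    return sum(beta / (2.0 * math.tanh(beta * s / 2.0)) for s in cells(th1, th2, t))

def second_difference(beta, th1, th2, t, h=1e-3):
    """Central finite-difference approximation of the second t-derivative."""
    return (phi(beta, th1, th2, t + h) - 2.0 * phi(beta, th1, th2, t)
            + phi(beta, th1, th2, t - h)) / (h * h)

if __name__ == "__main__":
    args = (2.0, 0.4, 0.5, 0.2)
    print(second_difference(*args), second_derivative_formula(*args))
```

Since every summand in (27) is positive, agreement of the two numbers also illustrates the strict convexity used later.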

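Proof: (a) Using the formula for $\Phi_\beta$ and the definition of $T_{ij}(\theta_1,\theta_2;t)$, for $i,j \in \{1,2\}$, we have

$$\begin{aligned}
\Phi_\beta(\theta_1,\theta_2;t) &= \Phi_\beta(\theta_1,\theta_2; T_{11}(\theta_1,\theta_2;t), T_{12}(\theta_1,\theta_2;t), T_{21}(\theta_1,\theta_2;t), T_{22}(\theta_1,\theta_2;t)) \\
&= p(\beta) - \theta_1 p(\beta\theta_1) - \theta_2 p(\beta\theta_2) - (1-\theta_1)\,p(\beta(1-\theta_1)) - (1-\theta_2)\,p(\beta(1-\theta_2)) \\
&\quad + \sum_{i,j=1}^{2} t_{ij} \ln\left(\frac{t_{ij}}{|\Lambda_{ij}|}\right) + \sum_{i,j=1}^{2} t_{ij}\, p(\beta t_{ij}) + \beta t_{12} t_{21}.
\end{aligned} \tag{28}$$

Moreover, from the formula for $p$, we see that

$$\frac{\partial}{\partial\theta}\left[\theta\, p(\beta\theta)\right] = \ln\left(\frac{1 - e^{-\beta\theta}}{\beta\theta}\right). \tag{29}$$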

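Using this with (28), we see that

$$\frac{\partial}{\partial t}\Phi_\beta(\theta_1,\theta_2;t) = \sum_{i,j=1}^{2} \left[1 + \ln\left(\frac{t_{ij}}{|\Lambda_{ij}|}\right)\right] \frac{\partial t_{ij}}{\partial t} + \sum_{i,j=1}^{2} \ln\left(\frac{1 - e^{-\beta t_{ij}}}{\beta t_{ij}}\right) \frac{\partial t_{ij}}{\partial t} + \beta t_{21} \frac{\partial t_{12}}{\partial t} + \beta t_{12} \frac{\partial t_{21}}{\partial t}.$$

Substituting in the partial derivatives $\partial t_{ij}/\partial t = \frac{\partial}{\partial t} T_{ij}(\theta_1,\theta_2;t) = (-1)^{i+j}$, we obtain

$$\frac{\partial}{\partial t}\Phi_\beta(\theta_1,\theta_2;t) = \sum_{i,j=1}^{2} (-1)^{i+j} \ln\left(\frac{t_{ij}}{|\Lambda_{ij}|}\right) + \sum_{i,j=1}^{2} (-1)^{i+j} \ln\left(\frac{1 - e^{-\beta t_{ij}}}{\beta t_{ij}}\right) - \beta(t_{12} + t_{21}).$$

This may be rewritten as

$$\frac{\partial}{\partial t}\Phi_\beta(\theta_1,\theta_2;t) = \sum_{i,j=1}^{2} (-1)^{i+j} \ln\left(\frac{1 - e^{-\beta t_{ij}}}{\beta |\Lambda_{ij}|}\right) - \beta(t_{12} + t_{21}). \tag{30}$$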


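So, taking the second derivative we obtain

$$\frac{\partial^2}{\partial t^2}\Phi_\beta(\theta_1,\theta_2;t) = \sum_{i,j=1}^{2} (-1)^{i+j}\, \frac{\beta e^{-\beta t_{ij}}}{1 - e^{-\beta t_{ij}}}\, \frac{\partial t_{ij}}{\partial t} - \beta\left(\frac{\partial t_{12}}{\partial t} + \frac{\partial t_{21}}{\partial t}\right) = \sum_{i,j=1}^{2} \frac{\beta e^{-\beta t_{ij}}}{1 - e^{-\beta t_{ij}}} + 2\beta.$$

Simplifying, this does give equation (27). Note that for $\beta = 0$ we interpret this as the $\beta \to 0$ limit
which is $\sum_{i,j=1}^{2} 1/t_{ij}$.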

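(b) Since $|\Lambda_{11}|\,|\Lambda_{22}| = |\Lambda_{12}|\,|\Lambda_{21}|$, equation (30) may be rewritten as

$$\begin{aligned}
\frac{\partial}{\partial t}\Phi_\beta(\theta_1,\theta_2;t) &= \ln\left(\frac{(1 - e^{-\beta t_{11}})(1 - e^{-\beta t_{22}})}{(1 - e^{-\beta t_{12}})(1 - e^{-\beta t_{21}})}\right) - \beta(t_{12} + t_{21}) \\
&= \ln\left(\frac{(1 - e^{-\beta t_{11}})(1 - e^{-\beta t_{22}})}{(e^{\beta t_{12}} - 1)(e^{\beta t_{21}} - 1)}\right).
\end{aligned} \tag{31}$$

Therefore, the critical point equation is as stated in part (b). □

Now let us prove Theorem 3.3.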

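Proof of Theorem 3.3: Firstly, we note that $\frac{\partial^2}{\partial t^2}\Phi_\beta(\theta_1,\theta_2;t)$ is manifestly positive on $\mathcal{T}_{\theta_1,\theta_2}$.
Therefore, $\Phi_\beta(\theta_1,\theta_2;\cdot) : \mathcal{T}_{\theta_1,\theta_2} \to \mathbb{R}$ is strictly convex. Next we attempt to solve the critical point equation. We recall that
$t_{ij} = T_{ij}(\theta_1,\theta_2;t)$. In particular,

$$t_{11} = t, \quad t_{12} = \theta_1 - t, \quad t_{21} = \theta_2 - t, \quad t_{22} = 1 - \theta_1 - \theta_2 + t.$$

So, if we define

$$u = 1 - e^{-\beta t_{11}} = 1 - e^{-\beta t},$$

then we get

$$\begin{aligned}
1 - e^{-\beta t_{22}} &= 1 - e^{\beta(\theta_1+\theta_2-1)} e^{-\beta t} = 1 - e^{\beta(\theta_1+\theta_2-1)} + e^{\beta(\theta_1+\theta_2-1)} u, \\
e^{\beta t_{12}} - 1 &= e^{\beta\theta_1} e^{-\beta t} - 1 = e^{\beta\theta_1}(1 - e^{-\beta\theta_1} - u), \\
e^{\beta t_{21}} - 1 &= e^{\beta\theta_2} e^{-\beta t} - 1 = e^{\beta\theta_2}(1 - e^{-\beta\theta_2} - u).
\end{aligned}$$

So the critical point equation is equivalent to

$$\frac{u\left[1 - e^{\beta(\theta_1+\theta_2-1)} + e^{\beta(\theta_1+\theta_2-1)} u\right]}{e^{\beta(\theta_1+\theta_2)}(1 - e^{-\beta\theta_1} - u)(1 - e^{-\beta\theta_2} - u)} = 1.$$

The equation is equivalent to

$$u\left[e^{-\beta(\theta_1+\theta_2)} - e^{-\beta} + e^{-\beta} u\right] = (1 - e^{-\beta\theta_1} - u)(1 - e^{-\beta\theta_2} - u).$$

Doing one more step of simplification, we obtain the equivalent formulation

$$(1 - e^{-\beta})u^2 - \left[(1 - e^{-\beta\theta_1}) + (1 - e^{-\beta\theta_2}) + e^{-\beta(\theta_1+\theta_2)} - e^{-\beta}\right]u + (1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2}) = 0. \tag{32}$$

We may simplify this as

$$(1 - e^{-\beta})u^2 - \left[(1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2}) + (1 - e^{-\beta})\right]u + (1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2}) = 0.$$

Or, splitting the polynomial,

$$\left[(1 - e^{-\beta})u - (1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2})\right](u - 1) = 0.$$

There are two solutions in the complex plane: $u = 1$, which will not correspond to $u = 1 - e^{-\beta t}$ for any
$t \in \mathcal{T}_{\theta_1,\theta_2}$, and

$$u = \frac{(1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2})}{1 - e^{-\beta}}.$$

This leads to the formula (9).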

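We should check that $t = R_\beta(\theta_1,\theta_2)$ is in $\mathcal{T}_{\theta_1,\theta_2}$. First, we may calculate

$$\frac{\partial}{\partial\theta_2} R_\beta(\theta_1,\theta_2) = r_\beta^{\mathrm{cdf,pdf}}(\theta_1,\theta_2) = \frac{e^{-\beta\theta_2} - e^{-\beta(\theta_1+\theta_2)}}{(1 - e^{-\beta}) - (1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2})}. \tag{33}$$

It is easy to see that $R_\beta(\theta_1, 0) = 0$ for all $\theta_1 \in (0,1)$. Therefore, we do have

$$R_\beta(\theta_1,\theta_2) = \int_0^{\theta_2} r_\beta^{\mathrm{cdf,pdf}}(\theta_1, y)\, dy. \tag{34}$$

The second derivative calculation is slightly more involved, involving fractions:

$$\begin{aligned}
\frac{\partial}{\partial\theta_1} r_\beta^{\mathrm{cdf,pdf}}(\theta_1,\theta_2) &= e^{-\beta\theta_2}\, \frac{\partial}{\partial\theta_1} \left( \frac{1 - e^{-\beta\theta_1}}{(1 - e^{-\beta}) - (1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2})} \right) \\
&= \beta e^{-\beta\theta_2} e^{-\beta\theta_1} \left( \frac{1}{(1 - e^{-\beta}) - (1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2})} + \frac{(1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2})}{\left[(1 - e^{-\beta}) - (1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2})\right]^2} \right) \\
&= \frac{\beta(1 - e^{-\beta})\, e^{-\beta\theta_1} e^{-\beta\theta_2}}{\left[(1 - e^{-\beta}) - (1 - e^{-\beta\theta_1})(1 - e^{-\beta\theta_2})\right]^2}.
\end{aligned}$$

This is manifestly positive for all $\beta \in \mathbb{R}$. (At $\beta = 0$ this equals 1 by taking limits.) Moreover, from
(33), $r_\beta^{\mathrm{cdf,pdf}}(0,\theta_2) = 0$. So

$$r_\beta^{\mathrm{cdf,pdf}}(\theta_1,\theta_2) = \int_0^{\theta_1} \frac{\beta(1 - e^{-\beta})\, e^{-\beta x} e^{-\beta\theta_2}}{\left[(1 - e^{-\beta}) - (1 - e^{-\beta x})(1 - e^{-\beta\theta_2})\right]^2}\, dx.$$

Putting this together with (34), we see that

$$R_\beta(\theta_1,\theta_2) = \int_{[0,\theta_1]\times[0,\theta_2]} \frac{\beta(1 - e^{-\beta})\, e^{-\beta x} e^{-\beta y}}{\left[(1 - e^{-\beta}) - (1 - e^{-\beta x})(1 - e^{-\beta y})\right]^2}\, dx\, dy.$$

Direct calculation of $R_\beta(\theta_1, 1)$ and $R_\beta(1, \theta_2)$ shows that it does have the correct marginals. So, it does
follow that $t = R_\beta(\theta_1,\theta_2)$ is in $\mathcal{T}_{\theta_1,\theta_2}$. □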
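The conclusions of this proof are easy to test numerically. The sketch below is only a check under stated assumptions: it takes $R_\beta(\theta_1,\theta_2)$ to be the value of $t$ recovered from $u = 1 - e^{-\beta t}$ with $u = (1-e^{-\beta\theta_1})(1-e^{-\beta\theta_2})/(1-e^{-\beta})$, and it verifies the critical point equation of Lemma 7.1(b), membership in the open interval $(\max\{0,\theta_1+\theta_2-1\}, \min\{\theta_1,\theta_2\})$, and the marginal identity $R_\beta(\theta_1,1) = \theta_1$. The function names are hypothetical.

```python
import math

def cells(theta1, theta2, t):
    """(t11, t12, t21, t22) as functions of t."""
    return t, theta1 - t, theta2 - t, 1.0 - theta1 - theta2 + t

def R(beta, theta1, theta2):
    """Solve u = 1 - exp(-beta t) for t, with u the non-trivial root of the quadratic."""
    u = ((1.0 - math.exp(-beta * theta1)) * (1.0 - math.exp(-beta * theta2))
         / (1.0 - math.exp(-beta)))
    return -math.log(1.0 - u) / beta

def critical_ratio(beta, theta1, theta2, t):
    """Left-hand side of the critical point equation; should equal 1 at t = R."""
    t11, t12, t21, t22 = cells(theta1, theta2, t)
    num = (1.0 - math.exp(-beta * t11)) * (1.0 - math.exp(-beta * t22))
    den = (math.exp(beta * t12) - 1.0) * (math.exp(beta * t21) - 1.0)
    return num / den

def in_interval(theta1, theta2, t):
    """Check that t lies strictly between max(0, th1+th2-1) and min(th1, th2)."""
    return max(0.0, theta1 + theta2 - 1.0) < t < min(theta1, theta2)

if __name__ == "__main__":
    for beta, th1, th2 in [(2.0, 0.4, 0.5), (-3.0, 0.7, 0.6)]:
        t = R(beta, th1, th2)
        print(beta, th1, th2, t, critical_ratio(beta, th1, th2, t))
```

Both signs of $\beta$ are included because the formula is claimed for all $\beta \in \mathbb{R}$ (with $\beta = 0$ understood as a limit, which this sketch does not handle).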


8 Outlook and extensions 


The result that we have presented here is stronger than a weak law of large numbers that was previously 
proved by one of the authors m- Moreover, the present argument is simpler, since the old argument 
used uniqueness theory for a certain type of partial differential equation. The old proof was not direct. 
However, the old result, weak as it is, was useful in a subsequent work by Mueller and one of the 
authors [20]. That was a weak law for the length of the longest increasing subsequence in a Mallows 
distributed random permutation, when $q = q_n$ scales such that $1 - q_n \sim \beta/n$ as $n \to \infty$, for some
$\beta \in \mathbb{R}$. Bhatnagar and Peled considered the more general case that $q_n$ may scale with $n$ in a more
singular way. It is therefore interesting to look for a more exact type of result than what has been 
presented in this article. 

The following result is true, and we will prove it in some detail in the appendix. 

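Lemma 8.1 Let us define $\{n\}! := [n]_q!/n!$. Let us define

$$\Sigma_4(n) = \{(n_{11}, n_{12}, n_{21}, n_{22}) \in \{0,1,\dots\}^4 : n_{11} + n_{12} + n_{21} + n_{22} = n\}.$$

Then for each such 4-tuple, defining $\mathcal{P}_{n,\beta}(n_{11}, n_{12}, n_{21}, n_{22}; \theta_1, \theta_2)$ to be

$$\mu_{n,\beta}\left(\left\{((x_1,y_1),\dots,(x_n,y_n)) \in ([0,1]^2)^n : \frac{1}{n}\sum_{k=1}^{n} \delta_{(x_k,y_k)} \in \mathcal{W}_{\theta_1,\theta_2}\left(\frac{n_{11}}{n}, \frac{n_{12}}{n}, \frac{n_{21}}{n}, \frac{n_{22}}{n}\right)\right\}\right),$$

we have the dependence on $\beta$, versus the usual multinomial formula for $\beta = 0$:

$$\mathcal{P}_{n,\beta}(n_{11}, n_{12}, n_{21}, n_{22}; \theta_1, \theta_2) = \mathcal{P}_{n,0}(n_{11}, n_{12}, n_{21}, n_{22}; \theta_1, \theta_2) \times \left. q^{n_{12} n_{21}}\, \frac{\{n_{11} + n_{12}\}!\,\{n_{11} + n_{21}\}!\,\{n_{12} + n_{22}\}!\,\{n_{21} + n_{22}\}!}{\{n_{11}\}!\,\{n_{12}\}!\,\{n_{21}\}!\,\{n_{22}\}!\,\{n_{11} + n_{12} + n_{21} + n_{22}\}!} \right|_{q = \exp(-\beta/(n-1))}.$$

Here the factor $q^{n_{12} n_{21}}$ accounts for the pairs of points, one in $\Lambda_{12}$ and one in $\Lambda_{21}$, which are always inverted; it corresponds to the term $\beta t_{12} t_{21}$ in the large deviation formula.

This may be proved using similar symmetries to those already on display here, except at the discrete
level.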
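The discrete symmetry behind this lemma can be illustrated by brute force on permutations. The sketch below is not the lemma itself but a hedged discrete analogue, stated here as an assumption: split positions at $a$ and values at $b$, let $n_{11}$ count positions $i \le a$ with $\pi_i \le b$ (so $n_{12} = a - n_{11}$, $n_{21} = b - n_{11}$, $n_{22} = n - a - b + n_{11}$), and compare $\sum q^{\mathrm{inv}(\pi)}$ over such permutations with $q^{n_{12}n_{21}}\,[a]_q!\,[n-a]_q!\,[b]_q!\,[n-b]_q! / \prod_{i,j}[n_{ij}]_q!$. All names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def inv(perm):
    """Number of inversions: pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])

def q_factorial(n, q):
    """[n]_q! = prod_{k=1}^n (1 + q + ... + q^{k-1})."""
    prod = Fraction(1)
    for k in range(1, n + 1):
        prod *= sum(q ** m for m in range(k))
    return prod

def block_sum(n, a, b, n11, q):
    """Sum of q^inv over permutations with exactly n11 small values in the first a positions."""
    total = Fraction(0)
    for perm in permutations(range(1, n + 1)):
        if sum(1 for v in perm[:a] if v <= b) == n11:
            total += q ** inv(perm)
    return total

def block_formula(n, a, b, n11, q):
    """Assumed closed form, including the cross factor q^(n12 * n21)."""
    n12, n21 = a - n11, b - n11
    n22 = n - a - b + n11
    num = q_factorial(a, q) * q_factorial(n - a, q) * q_factorial(b, q) * q_factorial(n - b, q)
    den = (q_factorial(n11, q) * q_factorial(n12, q)
           * q_factorial(n21, q) * q_factorial(n22, q))
    return q ** (n12 * n21) * num / den

if __name__ == "__main__":
    q = Fraction(2, 3)
    print(block_sum(5, 2, 3, 1, q), block_formula(5, 2, 3, 1, q))
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.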

This leads to a quantitative version of the results on display here, sufficient to establish a local 
central limit theorem. 

But moreover, it may be possible that this applies for $q = q_n$ scaling with $n$ in a more singular way
than $1 - q_n \sim c/n$. More precisely, Moak has obtained a full asymptotic expansion for the $q$-factorial
numbers such as {n}q in m- When q = exp(—/3/(n — 1)) the leading order part does lead to the 
large deviation function formulas we have derived here. But there are also correction terms, and more 
generally the lower order terms may be relevant when $q$ is not scaling as $1 - \beta n^{-1}$ for some finite
$\beta \in \mathbb{R}$. That might be useful for trying to further analyze models considered by Bhatnagar and Peled
in [7]. In a result related to |20] . we are considering a 9-square problem, where the middle square is 
small, having linear size on the order of That analysis is currently underway, and we hope to 

report on it soon, in another article. 


A Proof of Lemma 8.1


Let us begin with a lemma. 

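Lemma A.1 For each $n \in \mathbb{N}$, the polynomial $P_n(q) = \sum_{\pi \in S_n} q^{\mathrm{inv}(\pi)}$ satisfies the formula

$$P_n(q) = \prod_{k=1}^{n} \frac{1 - q^k}{1 - q} =: [n]_q!,$$

and for any $n \in \mathbb{N}$ and $k \in \{1,\dots,n\}$

$$\sum_{\pi \in \mathrm{Sh}_{n,k}} q^{\mathrm{inv}(\pi)} = \frac{[n]_q!}{[k]_q!\,[n-k]_q!},$$

where $\mathrm{Sh}_{n,k}$ is the set of all permutations $\pi \in S_n$ such that $\pi_1 < \dots < \pi_k$ and $\pi_{k+1} < \dots < \pi_n$.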
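Both formulas of the lemma can be confirmed by exhaustive enumeration for small $n$. The following sketch is only a sanity check; it uses exact rational arithmetic at a sample value of $q$ and represents $[k]_q$ as $1 + q + \dots + q^{k-1}$, which equals $(1-q^k)/(1-q)$ for $q \neq 1$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def inv(perm):
    """Number of inversions: pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])

def q_factorial(n, q):
    """[n]_q! = prod_{k=1}^n (1 - q^k)/(1 - q), written as a product of geometric sums."""
    prod = Fraction(1)
    for k in range(1, n + 1):
        prod *= sum(q ** m for m in range(k))
    return prod

def full_sum(n, q):
    """Sum of q^inv over all of S_n."""
    return sum(q ** inv(p) for p in permutations(range(1, n + 1)))

def is_shuffle(perm, k):
    """True if perm increases on positions 1..k and on positions k+1..n."""
    n = len(perm)
    first = all(perm[i] < perm[i + 1] for i in range(k - 1))
    second = all(perm[i] < perm[i + 1] for i in range(k, n - 1))
    return first and second

def shuffle_sum(n, k, q):
    """Sum of q^inv over the shuffle permutations Sh_{n,k}."""
    return sum(q ** inv(p) for p in permutations(range(1, n + 1)) if is_shuffle(p, k))

if __name__ == "__main__":
    q = Fraction(2, 3)
    print(full_sum(5, q) == q_factorial(5, q))
    print(shuffle_sum(5, 2, q) == q_factorial(5, q) / (q_factorial(2, q) * q_factorial(3, q)))
```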

Proof: The first formula is well-known. One may consult [9], for instance, and references therein.

The second formula is also well-known. But let us review the proof, since we will use the same
ideas subsequently. Given $\pi \in S_n$, and given $k \in \{1,\dots,n\}$, denote by $\pi^{(k)}$ the permutation such that

$$\{\pi^{(k)}_1,\dots,\pi^{(k)}_k\} = \{\pi_1,\dots,\pi_k\} \quad\text{and}\quad \{\pi^{(k)}_{k+1},\dots,\pi^{(k)}_n\} = \{\pi_{k+1},\dots,\pi_n\},$$

and such that $\pi^{(k)}$ is an element of $\mathrm{Sh}_{n,k}$. Also, let $\pi^{(k,-)}$ denote the permutation in $S_n$ such that
$\pi^{(k,-)}_j = j$ for $j > k$ and $\pi_j = \pi^{(k)}_{\pi^{(k,-)}_j}$ for $j \in \{1,\dots,k\}$. Similarly, define $\pi^{(k,+)}$ such that $\pi^{(k,+)}_j = j$
for $j \le k$ and $\pi_j = \pi^{(k)}_{\pi^{(k,+)}_j}$ for $j \in \{k+1,\dots,n\}$. Then it is an easy fact to see that

$$\mathrm{inv}(\pi) = \mathrm{inv}(\pi^{(k)}) + \mathrm{inv}(\pi^{(k,-)}) + \mathrm{inv}(\pi^{(k,+)}).$$

Moreover, defining $S^-_{n,k}$ to be the set of all permutations $\pi \in S_n$ such that $\pi_j = j$ for all $j > k$ (which
is isomorphic to $S_k$) and $S^+_{n,k}$ to be the set of all permutations $\pi \in S_n$ such that $\pi_j = j$ for all $j \le k$
(which is isomorphic to $S_{n-k}$), we have $S_n \cong \mathrm{Sh}_{n,k} \times S^-_{n,k} \times S^+_{n,k}$, where the bijection is the mapping
$\pi \mapsto (\pi^{(k)}, \pi^{(k,-)}, \pi^{(k,+)})$. Using this, one may easily prove the second formula. □


With this, we may give the desired proof. 

Proof of Lemma 8.1: We are going to start with a particular construction of a set of points
$((x_1,y_1),\dots,(x_n,y_n))$ such that $n^{-1}\sum_{k=1}^{n} \delta_{(x_k,y_k)}$ is in $\mathcal{W}_{\theta_1,\theta_2}((n_{ij}/n)_{i,j=1}^{2})$.
First of all we note that the particular ordering of $((x_1,y_1),\dots,(x_n,y_n))$ does not affect the density
$\mu_{n,\beta}((x_1,y_1),\dots,(x_n,y_n))$. Therefore, we will only describe the construction modulo a permutation
in $S_n$. We return to this point at the end.

Let us denote 


,(1:1) 


= n 


( 2 : 1 ) 


,( 1 : 1 ) 


,( 2 : 2 ) 


Let us choose points 

0 < < • • • < T (1) 


and 


0 <(<“>< ■■■<(<(),> 


'(i-.j-.l) _ r{i:j) 
k - ^,00 


= ni2 , 

(1:2) (2:1) 

=712 — 

1121, = 

(2:2) 

II 2 = II 22 • 





< 6*1, 

V 

V 

"2 


< 6*2, 

02<tr^<--- 

"2 


, shuffle permutations 


. Then we define 

'^^}, and 

II 

for fc G {1,. 

• ■ ,ii2*'^^}- 




Finally, let us choose permutations $\pi^{(i,j)} \in S_{n_{ij}}$ for each $i,j \in \{1,2\}$. Then we define $Z_{ij} \subset
\Lambda_{ij}(\theta_1,\theta_2)$ with $|Z_{ij}| = n_{ij}$ for each $i,j \in \{1,2\}$, as


Zii = 


d2:l:l), 

) 

: fc = 1 

,...,nii} 

^12 = 

II 

d2:2:l) 

) 

: fc = 1 

,•■•,1112} 

2'21 = 

= {(¥"■>, 

=(2:1:2) 

) 

: fc = 1 

, •■•,H2i} 

^'22 = 

= {(¥"b 

di 2 -. 2 -. 2 ) 

: fc = 1 

,•■•,1122} 


We let $(x_1,y_1),\dots,(x_n,y_n)$ be an arbitrary choice of points (meaning an arbitrary ordering) comprising
$\bigcup_{i,j=1}^{2} Z_{ij}$.

For each $i,j,i',j' \in \{1,2\}$, let us define

$$N_{(i,j),(i',j')}((x_1,y_1),\dots,(x_n,y_n)) = \#\{\{k,\ell\} : (x_k,y_k) \in \Lambda_{i,j},\ (x_\ell,y_\ell) \in \Lambda_{i',j'},\ (x_k - x_\ell)(y_k - y_\ell) < 0\}.$$

Then it is easy to see that

$$H_n = \sum_{i,j=1}^{2} N_{(i,j),(i,j)} + N_{(1,1),(1,2)} + N_{(1,1),(2,1)} + N_{(1,2),(2,2)} + N_{(2,1),(2,2)} + N_{(1,2),(2,1)}.$$

We do not need to include $N_{(1,1),(2,2)}$ because if $(x_k,y_k) \in \Lambda_{11}$ and $(x_\ell,y_\ell) \in \Lambda_{22}$ then we must have
$(x_k - x_\ell)(y_k - y_\ell) > 0$. For similar reasons, we have $N_{(1,2),(2,1)}((x_1,y_1),\dots,(x_n,y_n)) = n_{12} n_{21}$. Finally,
we claim that careful inspection shows that

■^(*.i).(M) = inv(7r(*’-i)), for each i,j G {1,2} 

A((i 4 ),(n 2 ) = inv( 7 r(^-^)), A/(i,i)y 2 ,i) = inv( 7 r(^'^)), 

A/(n 2 ).( 2 . 2 ) = inv( 7 r( 2 ' 2 )), A/'( 2 .i).( 2 . 2 ) = inv)?!^^'^)). 
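The bookkeeping of inversions by pairs of squares can be illustrated on random points. The sketch below is an illustration under stated assumptions: it labels the square of a point $(x,y)$ by $(i,j)$ with $i = 1$ if $x \le \theta_1$ and $j = 1$ if $y \le \theta_2$ (the boundary conventions are our choice), counts inverted pairs by their pair of labels, and checks that pairs from the squares $(1,1)$ and $(2,2)$ are never inverted while every pair from $(1,2)$ and $(2,1)$ is. The function names are hypothetical.

```python
import random

def block_index(x, y, theta1, theta2):
    """Label (i, j) of the square containing (x, y); boundary convention assumed."""
    return (1 if x <= theta1 else 2, 1 if y <= theta2 else 2)

def pair_counts(points, theta1, theta2):
    """Count inverted pairs {k, l}, grouped by the sorted pair of square labels."""
    counts = {}
    n = len(points)
    for k in range(n):
        for l in range(k + 1, n):
            (xk, yk), (xl, yl) = points[k], points[l]
            if (xk - xl) * (yk - yl) < 0:
                key = tuple(sorted([block_index(xk, yk, theta1, theta2),
                                    block_index(xl, yl, theta1, theta2)]))
                counts[key] = counts.get(key, 0) + 1
    return counts

def occupation(points, theta1, theta2):
    """Number of points n_ij in each square."""
    occ = {(i, j): 0 for i in (1, 2) for j in (1, 2)}
    for x, y in points:
        occ[block_index(x, y, theta1, theta2)] += 1
    return occ

if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(40)]
    print(pair_counts(pts, 0.4, 0.6))
    print(occupation(pts, 0.4, 0.6))
```

The total of all counts is the number of inversions of the point configuration, which is how the Hamiltonian decomposes into within-square and between-square contributions.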
Then, using the fact that $Z_n(\beta) = [n]_q!/n!$ for $q = \exp(-\beta/(n-1))$, we have

Tl^ 

Mn,/3((a;i,yi),..., (a;n,yn)) = 

Then summing over each of the independent permutations $\pi^{(i,j)} \in S_{n_{ij}}$ for $i,j \in \{1,2\}$ and over each of the
shuffle permutations, and using Lemma A.1, we may deduce the result. □




inv(7r 


(ij) { 


n< 




B Another derivation of the main results 


We announce here another approach to derive our main results, without providing all the details. 
Firstly, there does exist a $q$-Stirling's formula just like the usual Stirling's formula [19], wherein one
may see that the exponential term is as in (5). In fact, the $q$-Stirling's formula may be considered
to be superior because the dilogarithm arises in the leading-order exponential term rather than just
in the constant term. (In the usual Stirling's formula the constant must be derived by one of a
number of methods. One method is to realize that ln(n!) — (n + ^) ln(n) — n + ^ ln(2e) = where 

Rn —>■ /q ln(7ra:/sin(7ra:)) dx. Modulo elementary functions, this is equal to the dilogarithm function 
at $-1$, which may be evaluated as easily as it is to calculate $\zeta(2)$. On the other hand, since (5) is
also another expression for the dilogarithm of an exponential, we see that the dilogarithm arises in the
leading-order exponential part of the $q$-Stirling formula.)

An explicit formulation is as follows: for each /3 G R 


In 



9=exp(-/3/(ri-l)) 






+ Rn{f3) , 


(35) 


where $R_n(\beta) \to 0$ as $n \to \infty$. This follows from Moak's formula. But also, a simple proof is given (by
specializing to g’s of the form = 1 — -I- o(l)) where o(l) —>■ 0 as n —>■ oo) in [31]: see Theorem 

5.1.1. 

Using this formula, keeping only the leading-order exponential term, and using Lemma 8.1 we may
prove that 


lim ( < ((a:i,yi),...,(a:„,?/„)) : - e We^,e2{{^i3)lj=i) 

\ K 

= ^2; j^ll, J^12, i^21, ^^22), 

the formula from Theorem 3.1. But this is a harder approach. It is proved in complete detail in m in
Lemma 5.6.2. The advantage of this second approach is that by keeping track of the lower order terms
in (35) one may prove local limit theorems. We intend to do this in a later paper for the 9-square
problem, where the middle square has both sidelengths on the order of as a subset of [0,1]^, 

so that there are order points in the middle square. That is because this is what is needed to 
obtain the simplest bounds on the fluctuations for the length of the longest increasing subsequence in 
a Mallows random permutation when qn = 1 — + o(l)). The weak limit of this problem was 

previously considered in [20] . But we will obtain essentially bounds in a paper currently 

in preparation. 